ASP and special characters

Bill Appledorf
Thu, 21 May 1998 23:16:51 -0700

A few weeks ago someone posted a message here expressing concern about
capturing special characters input by users, transmitting these characters
in URLs, storing them in a database, and querying those values using ASP.

This person was also concerned that embedded single and double quotes in
user input pose a security risk because a malicious user might embed
destructive SQL in those fields. That risk is real whenever user input is
concatenated directly into a SQL string, and the usual defense is to pass the
input as query parameters instead. It is a separate topic, so I will not
address it here.

Six characters pose special problems to ASP programmers, seven if you count
space (%20). These characters are & (%26), + (%2B), and ? (%3F), because they
have special meaning in URL parameter strings, and " (%22), ' (%27), and |
(%7C), because they have special meaning in SQL statements.
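As a quick check of the escape codes listed above, here is a minimal sketch in Python rather than ASP. It assumes only the standard urllib.parse module; the names SPECIAL and escape_table are made up for this illustration.

```python
from urllib.parse import quote

# The seven characters named above, space included.
SPECIAL = [" ", "&", "+", "?", '"', "'", "|"]

def escape_table(chars):
    """Map each character to its %nn escape sequence."""
    # safe="" tells quote() not to leave any reserved character unescaped.
    return {ch: quote(ch, safe="") for ch in chars}

table = escape_table(SPECIAL)
for ch, esc in table.items():
    print(repr(ch), "->", esc)
```

In ASP itself, Server.URLEncode plays a similar role, although it encodes a space as + rather than %20.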

An ASP programmer must feel completely confident about four processes to be
able to handle these characters successfully:

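1. If you reference Request.Form and Request.QueryString fields containing
these characters in your code, between <% and %> delimiters, you will find
that these characters are encoded as %nn escape sequences. (This is how the
raw data arrives from the browser. A keyed lookup such as
Request.QueryString("ID") normally returns the decoded value, so check which
form your code is actually reading.)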

2. If you reference these same fields in your HTML, between <%= and %>
delimiters, ASP decodes the escape sequences for you and displays them as
ASCII printable characters.

3. If you store these characters in your database as escape sequences, you
can use the escape sequences in queries, and you can use the escape
sequences in URL parameter strings.

4. If you reference special characters as escape sequences internally in
your code, you have to decode them manually in order for them to display
properly.

Given these facts, one way you can solve the problem of gnarly characters is
to store Request.Form and Request.QueryString fields without decoding them,
treat them as escape sequences in your code, and only decode them when you
need to display them.
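The store-encoded, decode-on-display approach can be sketched as follows. This is an illustration in Python under stated assumptions, not ASP code: the helper names to_stored, to_url, and to_display, and the page name page.asp, are hypothetical.

```python
from urllib.parse import quote, unquote

def to_stored(user_text):
    """Encode user input once; this is the form kept in the database."""
    return quote(user_text, safe="")

def to_url(base, name, stored):
    """The stored form is already URL-safe, so it can be appended as-is."""
    return base + "?" + name + "=" + stored

def to_display(stored):
    """Decode only at the moment the value is shown to the user."""
    return unquote(stored)

stored = to_stored('Tom & "Jerry"')
url = to_url("page.asp", "name", stored)
shown = to_display(stored)
print(stored)
print(url)
print(shown)
```

The point of the design is that the value is encoded exactly once and decoded exactly once, so the special characters never appear raw inside a URL or a query string.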

There is a level of complexity, however, that you must deal with for this
approach to work. The % character is itself encoded as an escape sequence
(%25) when it appears in Request.QueryString and Request.Form fields.

For example, suppose you create a field in a form like so:

<INPUT TYPE="HIDDEN" NAME="ID" VALUE="<%= ID %>">

and suppose ID is a string containing the value

"%22Hello%22" .

Code that references Request.QueryString("ID") or Request.Form("ID"),
depending on whether you say GET or POST, will see the value

%2522Hello%2522

To decode this value properly, and to keep from accumulating strings of %25
sequences in your Request.QueryString and Request.Form fields if you use
them to assemble URLs for Response.Redirect statements, you need to say
Replace(Request.QueryString("<field name>"), "%25", "%") before you
manipulate these fields.
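The double-encoding effect and the %25 repair can be reproduced outside ASP. The sketch below uses Python's urllib.parse to stand in for the browser's form encoding, which is an assumption made for illustration; the variable names are hypothetical.

```python
from urllib.parse import quote, unquote

# A value that already contains escape sequences, as in the example above.
value = "%22Hello%22"

# Submitting it encodes each % again, so %22 becomes %2522.
on_the_wire = quote(value, safe="")

# The repair from the text: collapse each %25 back to a single %.
restored = on_the_wire.replace("%25", "%")

# Decode once more only when the value is about to be displayed.
shown = unquote(restored)

print(on_the_wire)
print(restored)
print(shown)
```

Without the repair step, each round trip through a form or redirect would add another layer of %25 to the value.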

I am posting this message in hopes it might help a kindred soul or two out
there. Other people may have devised other ways to handle these characters.
My bottom line design requirement is that users have to be able to type
whatever they want, and my code has to handle it transparently to them. This
method works for me. Perhaps it will work for you.

Bill Appledorf
billappledorf@usa.net