[b0iler]

| Also, you are just sending the inputted values of parameters. What
| about the names of the parameter (the $key variables)? They could
| contain potentially dangerous XSS which is often printed to the
| client. Also, user input (GPC) is not the only tainted data in a
| script. Any data that comes from an outside source is potentially
| dangerous. Files, databases, ENV variables, etc. need to be
| treated as if it contains the most clever tricks to evade your
| filtering and protection schemes.

Correct. And I've tried to say the same quite a few times on several
securityfocus lists the last two years.

We need to shift the focus away from _input_. Input is never
troublesome in itself. It only becomes troublesome when it is put in a
context in which it is interpreted in some way, and then only when
parts of it are interpreted not as plain data, but as something else.

As b0iler (whoever that is :) ) correctly states above, data from the
inside may cause just as much trouble as data from the outside. And it
may do so deep inside a multi-tier system, far from the web layer.

It's when data is passed somewhere for interpretation that it gets
troublesome. We should thus pay attention to the format of the data
whenever we _pass_it_along_, rather than when we receive it from the
outside.

Web applications tend to pass data along all the time:

* to database servers, often by concatenating the data with strings
  containing SQL constructs, or by using some kind of prepared
  statement mechanism (much better).

* to shell command interpreters (yikes!).

* to the OS by sending file names to file handling functions, host
  names to name resolution libraries and so on. (a large amount of
  "so on" for the OS.)

* to legacy systems written in some obscure language using some
  equally obscure protocol.

* to other web servers (B2B) using XML, URL parameters or whatever.

* to other processes running on the same server, using some
  internally made protocol.
* many, many, many more...

* and last, but not least, to the web browser of the user. Which
  luckily is just another sub-system, covered by the same rule as
  the rest.

And to repeat: "Data" is not only user input. It is anything, no
matter the source.

Every system we pass data to has its own way of interpreting it, and
the interpretation depends on context. Some examples:

* when building strings containing SQL queries, the quote character
  may cause trouble if it appears prematurely in an SQL string
  constant. _Any_ data passed as part of an SQL statement
  _that_is_to_be_interpreted_as_a_string_constant_ will need to have
  quotes escaped in some way. (No, we can't generally forbid quotes.
  How would I be able to write "can't" a few words back if you forbid
  the quote?) And no, we can't generally escape quotes at input time
  either, because then they will look rather funny to the _other_
  sub-systems, in which quotes have no special meaning (e.g. a text
  file or the user's browser). For more on this, see another
  vuln-dev-mail of mine available here:

    http://shh.thathost.com/text/passing-data-03.txt

* when talking to the OS, null-bytes may create confusion when
  passing strings, as the OS (written in C, normally) treats the '\0'
  as a string terminator. Most "modern" languages do not. We'll
  generally need to pay attention to null-bytes when talking to
  sub-systems written in C. The reason is generally that our view of
  the string will differ from the view taken by the OS.

  But there are other things as well. If we pass a _file_name_ to
  the OS, we may need to pay attention to slashes (and for some
  obscure OSes, backslashes) and double-dots as well, as they will
  switch context from _file_ to _directory_.

  And there are hundreds of other examples of how talking to one
  particular sub-function (e.g. open()) of a sub-system (e.g. the OS)
  will need careful handling of a selected set of characters.

* and then comes the browser again.
  The HTML parser in the browser gives special meaning to < (tag
  start), > (tag end) and & (character entity). And inside those <
  and >, " and ' (both attribute value encapsulators) may suddenly
  have a special meaning too. We'll need to escape them somehow, so
  that they are not treated as special characters, but rather as
  plain characters. The correct way is to use HTML encoding (as most
  of you know). The wrong way (generally) is to replace the special
  characters with nothing. Imagine all the complaints you will get if
  you make a discussion forum for mathematicians, and disallow < and
  > ...

It is generally _not_possible_ to fetch data from the request and
start by doing something to it that will match all the possible
sub-systems in one go. Not without giving severe restrictions as to
what the data may contain. ("Sorry, Sinead, but your name will have to
be OConnor for now"). And not without introducing strange appearances
for some of the sub-systems. ("Welcome, Sinead O\'Connor").

Input validation has been given _far_ too much focus. It may be good
as a first measure, to be able to give users nice feedback when data
don't match the business rules and other high level rules ("the file
name is not supposed to contain directory elements"), but it generally
won't solve the low level problems. In systems over toy size, data is
passed between many different sub-systems, which often have different
meta-characters that may be abused. People who believe that input
validation at the web layer will avoid security problems several
layers down below (or when data come back to the first layer again),
have given the issue too little thought, IMNSHO.

Focus on input validation, but focus even more on handling every
possible meta-character, meta-byte, meta-word or whatever before
passing the data along to the next sub-system, whatever that is. And
that rule goes for every layer of the application, not just the web
layer.
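
To make the "handle it when you pass it along" rule a bit more
concrete, here is a minimal sketch in Python, using only the standard
library. It is an illustration under stated assumptions, not a
complete solution: the function names (store_name, render_greeting,
safe_file_name) and the users table are made up for the example, and
the file name check only covers the characters mentioned above. The
point is that the same unmodified string is handled differently for
each sub-system, at the moment it is handed over.

```python
import html
import sqlite3


def store_name(conn, name):
    # Passing data to the SQL sub-system: a bound parameter keeps the
    # value out of the SQL text, so a quote in it is just data.
    # (The "users" table is a hypothetical example schema.)
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))


def render_greeting(name):
    # Passing data to the browser: HTML-encode <, >, & and both quote
    # characters instead of removing or forbidding them.
    return "<p>Welcome, " + html.escape(name, quote=True) + "</p>"


def safe_file_name(name):
    # Passing data to the OS as a plain file name: refuse null bytes,
    # slashes, backslashes and the dot-only directory names.
    # This is a deliberately narrow check, not a general path sanitizer.
    if "\0" in name or "/" in name or "\\" in name or name in ("", ".", ".."):
        raise ValueError("not a plain file name: %r" % (name,))
    return name


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")

    name = "Sinead O'Connor"          # stored exactly as entered
    store_name(conn, name)
    stored = conn.execute("SELECT name FROM users").fetchone()[0]

    print(stored)                     # the quote survives the database
    print(render_greeting(stored))    # and is encoded only for HTML
```

Note that nothing is done to the name when it arrives. The database
gets it through parameter binding, the browser gets it HTML-encoded,
and a text file would get it untouched, so none of the sub-systems
sees a "funny" backslash meant for another one.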

Sverre
- who feels this discussion would fit better at webappsec than at
  vuln-dev.

-- 
shhat_private                    Computer Geek? Try my Nerd Quiz
http://shh.thathost.com/         http://nerdquiz.thathost.com/

This archive was generated by hypermail 2b30 : Wed Oct 16 2002 - 14:24:47 PDT