Last year, one of our security researchers, Mo, discovered an unsafe deserialization vulnerability in Apache Struts. It turned out to allow remote code execution, and it was present in the default configuration for Struts, so this was a pretty high-impact vulnerability.
Today, I'm going to show you how to find unsafe deserialization vulnerabilities using QL. You can see here that I've got a copy of QL for Eclipse. I've loaded up a snapshot of Struts from August last year, which includes the vulnerability, and I'm going to walk you through the process of finding that vulnerability with QL.
To start, I'm just going to look for all of the places in the code where we potentially perform deserialization. I'm interested in calls where the thing being called has the name fromXML. We've got two results here, and we can jump to the source code in the snapshot to see that these are both calls to fromXML. Now, this is not necessarily enough for these to be vulnerable. In addition to the deserialization happening, the user has to be able to control the value that's passed into fromXML. This is a pretty typical type of problem when you're doing variant analysis: you have some value that is potentially user controlled, and you want to know if it reaches a dangerous operation.
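The query typed at this step is not captured in the transcript. A minimal sketch of what it might look like, assuming the `MethodAccess` class from the QL Java library of that era (current CodeQL releases may use different class names), and with a result message that is purely illustrative:

```ql
import java

// Find every call whose target method is named "fromXML".
// This is a purely syntactic match: it says nothing yet about
// whether the argument is user controlled.
from MethodAccess call
where call.getMethod().getName() = "fromXML"
select call, "Call to fromXML"
```

Matching on the method name alone is deliberately broad. It serves as a starting list of candidate sinks rather than a list of vulnerabilities.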
Semmle provides a library that allows you to do a data flow analysis. I've just pulled that in with these imports here, and that's what I'm going to use to decide, for each of these fromXML calls, whether it's really vulnerable or whether it's okay. The first thing I need to do is define a data flow configuration that tells us what the sources are (the values the user might be able to control) and what the sinks are (in this case, the arguments to fromXML). There's a little bit of boilerplate here while I set this up.
Then the information I need to provide is what the sources are, so this is how I do that. We're going to say something is a source if it's a kind of RemoteFlowSource. This is a class that's provided by Semmle, and it covers a lot of the standard ways that an end user can control a value in a Java application. This includes things like the web server annotations that you'll be familiar with. Of course, if you want to add your own sources or customize that, you can put whatever you want in here. Now I'm going to do the same for the sinks. We said earlier that anything that gets passed into fromXML is potentially dangerous. So we say something is a sink if there is a call to fromXML and one of the arguments to that call is our sink.
I've written my configuration, and now I can go ahead and actually use it in the query. What I'm going to do here is declare the flow config, a path node for the source, and a path node for the sink. The only condition I'm going to impose on these is that, under the configuration I gave, there is a flow from the source to the sink. Then I'm going to return the source first, which says where we should send the user when they click on this result. I'm additionally going to give the source and the sink, which provides a bit of context for the result by showing how data gets between these two things, and finally a message to display. I can now run that.
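The on-screen query is not part of the transcript, so here is a hedged reconstruction of the configuration and query as described, assuming the class-based `DataFlow::Configuration` API of the Semmle Java libraries of that period (later CodeQL releases use a different style). The class name `FromXmlConfig`, the metadata values, and the message are illustrative, not taken from the demo:

```ql
/**
 * @name Unsafe deserialization via fromXML (sketch)
 * @kind path-problem
 * @id demo/struts-unsafe-deserialization
 */

import java
import semmle.code.java.dataflow.DataFlow
import semmle.code.java.dataflow.FlowSources
import DataFlow::PathGraph

// Illustrative name: the transcript does not name the configuration.
class FromXmlConfig extends DataFlow::Configuration {
  // Boilerplate: each configuration identifies itself with a unique string.
  FromXmlConfig() { this = "FromXmlConfig" }

  // Sources: anything the library models as remotely user controlled.
  override predicate isSource(DataFlow::Node source) {
    source instanceof RemoteFlowSource
  }

  // Sinks: any argument of a call to a method named fromXML.
  override predicate isSink(DataFlow::Node sink) {
    exists(MethodAccess call |
      call.getMethod().getName() = "fromXML" and
      sink.asExpr() = call.getAnArgument()
    )
  }
}

from FromXmlConfig config, DataFlow::PathNode source, DataFlow::PathNode sink
where config.hasFlowPath(source, sink)
// First column: where a click on the result lands.
// Then the two path endpoints, then the message.
select source.getNode(), source, sink, "User-controlled data reaches a call to fromXML."
```

Selecting the source first follows the description above. Many path queries put the sink in the first column instead, so that a click lands on the dangerous call.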
Okay, that's finished, and we've got two results here. You can see the first one is in test code, so I'll skip over that, and we can take a look at the second result. This is the source: as you can see, request is an HTTP servlet request, and its input stream is something that's likely to be user controlled. Up here on the right, we've got the path explorer. This shows us all of the steps we go through to get from the source to the sink, which is the argument to fromXML.
We start with the input stream here. You can see it gets wrapped in a reader there, and then it's passed into this toObject method. We move to the next stage: here we've got the parameter of toObject on the XStream handler, and going back again we can see that handler is a content type handler. If you can send a request that is going to be handled by the XStream handler, then you can pass whatever data you want into this fromXML method, and that will allow you to get remote code execution with an appropriately crafted request.
So that shows you how to write a simple query that will find vulnerabilities like this. It's easy to modify if you have your own sources, for example if you've got a custom web server. It's just as easy to customize the sinks if you're looking for other types of deserialization. You can tweak this even further by adding things like barriers for sanitization, so you can customize the query as much as you want and cut down on false positives.
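As a rough illustration of those customization points, the sketch below extends the same kind of configuration with an extra source, an extra sink, and a barrier. It assumes the class-based `DataFlow::Configuration` API and its `isBarrier` predicate. The method names `readRequestBody`, `deserialize`, and `validateXml` are hypothetical stand-ins for whatever your own code base uses, not part of Struts, XStream, or the QL libraries:

```ql
/**
 * @name Unsafe deserialization with custom sources, sinks and barriers (sketch)
 * @kind path-problem
 * @id demo/custom-unsafe-deserialization
 */

import java
import semmle.code.java.dataflow.DataFlow
import semmle.code.java.dataflow.FlowSources
import DataFlow::PathGraph

class CustomDeserializationConfig extends DataFlow::Configuration {
  CustomDeserializationConfig() { this = "CustomDeserializationConfig" }

  override predicate isSource(DataFlow::Node source) {
    source instanceof RemoteFlowSource
    or
    // Hypothetical custom web server: treat the result of any call to
    // a method named readRequestBody as user controlled.
    source.asExpr().(MethodAccess).getMethod().getName() = "readRequestBody"
  }

  override predicate isSink(DataFlow::Node sink) {
    exists(MethodAccess call |
      (
        call.getMethod().getName() = "fromXML"
        or
        // Hypothetical second deserialization entry point.
        call.getMethod().getName() = "deserialize"
      ) and
      sink.asExpr() = call.getAnArgument()
    )
  }

  // Barrier: flow stops at the result of a (hypothetical) validation call,
  // so paths that pass through validateXml are not reported.
  override predicate isBarrier(DataFlow::Node node) {
    node.asExpr().(MethodAccess).getMethod().getName() = "validateXml"
  }
}

from CustomDeserializationConfig config, DataFlow::PathNode source, DataFlow::PathNode sink
where config.hasFlowPath(source, sink)
select source.getNode(), source, sink, "User-controlled data reaches a deserialization call."
```

Matching by bare method name keeps the sketch short. A real query would usually also constrain the declaring type so that unrelated methods with the same name are not picked up.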