There's always more stuff to find on Google. And as with all programs, if you actually read the manual, you can do things you didn't even imagine.
A couple of Google features have been sparking interest lately on security mailing lists. Both of the features rely on users leaving sensitive information out where Google can find it.
Once Google knows about it, you can use advanced search features to dig up the information. It's issues such as these that remind me that the weakest security link is usually the one between the user's ears.
The first one I saw is the numerical-range search feature. You can search for numbers between a low and a high value by putting two periods between them. For example, a search for “100..199 Madison Ave.” will find any address from “100 Madison Ave.” through “199 Madison Ave.,” including “156 Madison Ave.”
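For the curious, here is a minimal Python sketch of how such a range query string is put together. The function name `range_query` and its parameters are my own invention for illustration; the only thing taken from Google is the `low..high` syntax described above.

```python
def range_query(low: int, high: int, context: str = "") -> str:
    """Build a numeric-range query of the form 'low..high context'.

    Hypothetical helper for illustration; it only formats a string
    and does not contact any search service.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    # Two periods between the bounds mark a numeric range.
    query = f"{low}..{high}"
    # Append the surrounding words, if any, and trim stray whitespace.
    return f"{query} {context}".strip()


print(range_query(100, 199, "Madison Ave."))  # prints: 100..199 Madison Ave.
```

Paste the resulting string into the search box as you would any other query.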
This being the Internet, the next logical step was to search for credit card numbers. Try this one: “Visa 4366000000000000..4366999999999999”.
But don't go rushing off to Amazon.com just yet to rip off the poor schnook in the Google entry. A lot of the numbers you find in entries like this are fake: test data for software developers, for instance. But I'm sure some of them are real, too. Somebody was careless with that data.
But the other feature is my favorite. You can search for files by file type, meaning file extension. Want to search for all Adobe Acrobat files containing the phrase “foreign car repair”? Try this query: “foreign car repair” filetype:PDF. That's about 78 hits!
Google actually can search 12 non-HTML formats including Microsoft Office, PostScript, Corel WordPerfect, Lotus 1-2-3 and, as we have seen, PDF. And you can mess with the query a little to search simply for a file of a particular type, e.g. “QDF filetype:QDF”.
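If you would rather script these lookups than type them, a file-type query is easy to assemble. The sketch below is illustrative only: `filetype_query` and `search_url` are hypothetical helper names, and the search URL with its `q` parameter is an assumption about how Google's public search page is addressed, not something documented in this column.

```python
from urllib.parse import urlencode

# Assumed address of the public search page; adjust if it differs.
SEARCH_URL = "https://www.google.com/search"


def filetype_query(file_type: str, phrase: str = "") -> str:
    """Combine an optional exact phrase with a filetype: restriction."""
    ext = file_type.lstrip(".")  # accept "PDF" or ".PDF"
    parts = []
    if phrase:
        parts.append(f'"{phrase}"')  # quotes request an exact phrase
    parts.append(f"filetype:{ext}")
    return " ".join(parts)


def search_url(query: str) -> str:
    """URL-encode the query so it can be opened in a browser."""
    return f"{SEARCH_URL}?{urlencode({'q': query})}"


query = filetype_query("PDF", "foreign car repair")
print(query)
print(search_url(query))
```

Nothing here talks to Google; it only builds the string and the link, which you can then open yourself.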
Yes, that will show you Quicken data files that users have helpfully put up on the Web. The vast majority of these appear to be sample files for books and such, but I haven't looked at all of them. I'd bet there are some real ones.
And I've probably seen worse. I won't get into any more details, but by searching for important file types, I found sensitive budget information for a major media company. My jaw still hurts from hitting the floor.
Of course, there's no problem here with Google. There's a problem with users and administrators putting sensitive data out where Google can find it. Some of the files I saw appeared to be on users' member sites for their ISP accounts. I suppose this is meant to be a poor man's remote-access method, in that they can get to important files through the Web page. Oy, what a bad idea!
In fact, you might want to do some creative searching of your own sites (using the “site:mysite.com” search modifier) to get a sense of what Google has on you. If you find the wrong things up there, you can remove them by following Google's instructions.
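One way to make that self-check routine is to generate a short list of queries that pair the site: modifier with each file type you would hate to see indexed. This is only a sketch: the function name `audit_queries` and the list of extensions are my own assumptions, so swap in whatever formats matter to your organization and replace mysite.com with your own domain.

```python
# Illustrative list of extensions worth checking; not exhaustive.
SENSITIVE_TYPES = ["qdf", "xls", "doc", "pdf"]


def audit_queries(domain: str, file_types=SENSITIVE_TYPES) -> list[str]:
    """Return one 'site:... filetype:...' query per file type.

    Hypothetical helper: it only builds query strings for you to run
    by hand against your own domain.
    """
    return [f"site:{domain} filetype:{ext}" for ext in file_types]


for q in audit_queries("mysite.com"):
    print(q)
```

Run each query by hand and look hard at anything that turns up.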
And don't rely on password protection to make it “safe” to put those files up there. I guarantee you that for any significant file format, there's an easily available crack program to break the password protection, and I know this is the case for Quicken and QuickBooks data.
So, keep your sensitive data off the Web, even for brief periods of time. Even if there are ways to keep Google off it, it's not worth the chance that some crawling program will find it and grab the contents.
Security Center Editor Larry Seltzer has worked in and written about the computer industry since 1983.