Of data and snooping

A lot has been written, and a lot of admonishing done, about the recent revelation that the Justice Department has been collecting and mining telephone call data.  It’s difficult to separate the political tribalism from legitimate concern and objective public policy debate, but I wanted to write a bit from my perspective as a guy who’s worked on big data problems for a long time, and as someone who knows a little about the ethical problems inherent in crossing lines of privacy in the service of other objectives.  I will probably satisfy neither side of the sick polarity, but that’s a good thing, because the resident paranoia is a spirit I avoid: it wastes untold time and energy.

If I were given the problem of tracking the communication of bad guys, those who have done my country harm and would do it again, I would definitely look at known phone numbers and web sites for activity to and from them.  Given the mass of data, I would put it into a large database, in the form of a data warehouse, and write some queries against it.  Those queries would find the numbers and IP addresses making contact with the known numbers and web sites, and probably branch out from there.  That is, bad guys know other bad guys, and to some level of indirection it would make sense to find the whole family.  And when activity picked up, alarms would sound and people would do things.  Corporations have used such means to track anomalies for decades, and they use customer data (that is, yours and mine) to do so.  It’s helpful and effective, and solving problems like that has fed many a family of those employed in the field.  That’s not evil, and it’s not even controversial as a raw data mining exercise.
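
For the technically curious, the "branch out from there" step can be sketched as a breadth-first walk over call records. This is only a toy illustration under my own assumptions, not a description of any real system: the `(caller, callee)` record shape, the `contacts_within` function name, and the made-up phone numbers are all hypothetical, and a real data warehouse would do this with queries rather than in-memory Python.

```python
from collections import defaultdict, deque


def contacts_within(records, seeds, max_hops):
    """Return {number: hops} for every number reachable from the seed
    numbers within max_hops calls. Hypothetical helper for illustration."""
    # Treat a call as a link in both directions ("to and from").
    neighbors = defaultdict(set)
    for caller, callee in records:
        neighbors[caller].add(callee)
        neighbors[callee].add(caller)

    hops = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        number = queue.popleft()
        if hops[number] == max_hops:
            continue  # do not branch out past the chosen level of indirection
        for other in neighbors[number]:
            if other not in hops:
                hops[other] = hops[number] + 1
                queue.append(other)
    return hops


# Made-up call records: (caller, callee)
records = [
    ("555-0100", "555-0111"),
    ("555-0111", "555-0122"),
    ("555-0122", "555-0133"),
    ("555-0144", "555-0155"),
]

# Start from one known number and look two levels out.
found = contacts_within(records, {"555-0100"}, max_hops=2)
```

The `max_hops` limit is the "level of indirection" knob: raise it and the family grows quickly, which is exactly where the policy questions start.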

Now, when linked to administrations in the current popular polarity, this becomes tied to special interests and questionable practices.  “Bad guys” can (and some say they do) include the enemies of such administrations, whether donkey or elephant, so flames of distrust are fanned and people are up in arms.  It doesn’t matter who is in power: suspicion is the sentiment of the day, and one’s enemy party is always trying to oppress and injure.

I also work in the area of messaging (e-mail, instant, you name it), and it may surprise some to know that work e-mail is the property of one’s employer.  I remember giving a talk at a user conference one time about mining such e-mail to determine expertise, a very valuable practice for a corporation and for its individuals as well (answering the question “I wonder who knows about X?” where X is any discipline or body of knowledge with value).  I was all but heckled by an individual whose expertise was, true confession live and in color, Monica Lewinsky jokes.  He did not want to be known for such expertise.  He could be fired or certainly demoted, and his reputation sullied.

Now, I did not and still cannot vouch for his employer’s not going after such content in e-mail, but there is a practice of privacy in place in corporations that cannot be legitimately breached.  If an employer is known as someone who does such mining, it would be a mark against them to some extent and a declaration of a type of corporate environment governed by fear.

I would apply that same force of restraint to the government in this case.  Do I know that data isn’t being misused?  No, but I do know that if it is, there are forces that would come into play that would ruin the careers of those doing such things.

So the popular polarity will wage its incessant war of words and even amplify its distrust and make ever more malicious accusations and counter-accusations.  

But I’ll trust both in God and in natural forces of restraint.

And I won’t be calling any terrorists.  Well, maybe a wrong number.