The Golden State Killer Suspect’s DNA Was in a Public Database, and Yours Could Be, Too

Remember when we all got a little scared that genetic testing companies might be forced to turn your data over to law enforcement? Ah, those were simpler times. This week, police identified a suspect in the Golden State Killer case, thanks in part to a relative’s sample in a publicly available DNA database. It’s the same kind of database your relatives may already be in.

For this type of use, it doesn’t matter what the privacy policy says or whether the company is willing to hand over your data. None of the major DNA companies provided any data; you can read many of their statements in this BuzzFeed story, which is still being updated. Instead, lead investigator Paul Holes said his team ran the data through a publicly available database called GEDmatch.

Many people have voluntarily uploaded their DNA to GEDmatch and other databases, often with real names and contact information. This is what you do if you are an adopted child looking for a long-lost parent, or a genealogy buff wondering whether you have any cousins still living in the old country. GEDmatch requires you to make your DNA data publicly available if you want to use its comparison tools, although you don’t need to provide your real name. And it’s not the only database that has helped law enforcement track people down without their knowledge.

How DNA Databases Help Track People

We do not know exactly which samples or databases were used in the Golden State Killer case; the Sacramento County District Attorney’s Office provided very little information and would not confirm any additional details. But here are some of the possibilities.

Y-chromosome data can help guess an unknown person’s last name.

Cis males usually have an X and a Y chromosome, while cis females have two X chromosomes. This means the Y chromosome is passed from genetic males only to their male offspring, that is, from father to son. Since surnames are also often inherited from the father, in many families you share a surname with anyone who has your Y chromosome.

A 2013 Science article describes how a small amount of Y-chromosome data should be enough to determine the last names of approximately 12 percent of white males in the United States. (The method returns the wrong last name about 5 percent of the time, and the rest come back as unknown.) The authors warn that the more people upload their information to public databases, the more likely the method is to succeed.

This is exactly the method that genealogy consultant Colleen Fitzpatrick used to narrow down the suspects in a cold case in Arizona. It seems she used short tandem repeat (STR) data from the suspect’s Y chromosome to search a genealogy DNA database, and the name Miller showed up in the results.

The police already had a long list of suspects in the Arizona case, but based on this information, they zeroed in on a suspect named Miller. As with the Golden State Killer case, police confirmed the DNA match by obtaining a fresh DNA sample directly from their suspect; the Sacramento office said it got the sample from something he threw away. (Yes, it’s legal, and it can be something as mundane as a used drinking straw.)

The authors of the Science paper note that a last name, place of birth, and year of birth are often enough to find a person in census records.

SNP files can find relatives.

When you download your raw data after mailing a sample to 23andMe or Ancestry, you get a list of locations in your genome (called SNPs, for single nucleotide polymorphisms) and two letters indicating your genotype at each one. For example, at a particular SNP, you might have inherited an A from one parent and a G from the other.

Genetic testing sites have tools to compare your DNA with others in their own database, but you can also download your raw data and upload it to other sites, including GEDmatch or Family Tree DNA. (23andMe and Ancestry allow you to download your data, but they do not accept uploads.)

But you don’t need to send a saliva sample to one of these companies to get a raw data file. The DNA Doe Project describes how they sequenced the entire genome of an unidentified girl from a cold case and used that data to create an SNP file to upload to GEDmatch. They found someone who shared enough SNPs with her that the two were likely close relatives. This cousin also had an Ancestry account, with a family tree filled in with details about family members. The tree included a note about a cousin of the same age as the unidentified girl, whose date of death was listed as “missing, presumed dead.” It was her.
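To make the idea of "sharing enough SNPs" concrete, here is a toy sketch in Python. It is an illustration under loud assumptions, not how GEDmatch or any real service works: the SNP identifiers and genotypes are made up, and the function `shared_allele_fraction` is a hypothetical helper that simply counts positions where two files have at least one letter in common. Real matching tools are far more sophisticated and generally look for long shared stretches of DNA rather than tallying individual positions.

```python
def shared_allele_fraction(a, b):
    """Fraction of SNPs found in both files where the two genotypes
    have at least one letter (allele) in common."""
    common = [snp for snp in a if snp in b]
    if not common:
        return 0.0
    sharing = sum(1 for snp in common if set(a[snp]) & set(b[snp]))
    return sharing / len(common)


# Made-up SNP IDs and genotypes, for illustration only.
unknown_person = {"rs0001": "AG", "rs0002": "CC", "rs0003": "TT", "rs0004": "AG"}
database_entry = {"rs0001": "AA", "rs0002": "CT", "rs0003": "CC", "rs0005": "GG"}

# Only SNPs present in both files are compared.
print(shared_allele_fraction(unknown_person, database_entry))
```

With only a handful of positions, a score like this says almost nothing; the point is simply that a raw data file is all a comparison needs, with no name attached.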

Your DNA Is Not Only Yours

When you submit a saliva sample or upload a raw data file, you may be thinking only about your own privacy. “I have nothing to hide,” you might tell yourself. What difference does it make if someone finds out that I have blue eyes or a predisposition to heart disease?

But half of your DNA also belongs to your biological mother, and half to your biological father. Another half (a different half each time) belongs to each of your children. On average, you share half of your DNA with a full sibling, and a quarter with a half sibling, grandparent, aunt, uncle, niece, or nephew. You share an eighth with a first cousin, and so on. The more members of your extended family are into genealogy, the more likely it is that some of your DNA is already in a public database, put there by a relative.
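Those fractions follow a simple rule of thumb: each generation step along a line of descent halves the expected share, and relatives connected through two common ancestors get two such contributions. The short Python sketch below works through that arithmetic. The function name `expected_shared` and the table of relatives are illustrative choices, and the results are averages; the actual amount shared between any two relatives (other than parent and child) varies.

```python
from fractions import Fraction


def expected_shared(paths):
    """Average fraction of DNA shared by two relatives.

    `paths` holds one entry per shared ancestor: the number of
    parent-child steps linking the two people through that ancestor.
    Each step halves the expected share.
    """
    return sum(Fraction(1, 2) ** steps for steps in paths)


# One entry per connecting ancestor (illustrative table).
RELATIVES = {
    "parent or child": [1],
    "full sibling": [2, 2],        # linked through both mother and father
    "half sibling": [2],           # linked through one shared parent
    "grandparent": [2],
    "aunt, uncle, niece, or nephew": [3, 3],
    "first cousin": [4, 4],        # linked through both shared grandparents
}

for relation, paths in RELATIVES.items():
    print(f"{relation}: {expected_shared(paths)}")
```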

In the cases we mention here, the breakthrough came when DNA was matched, through a public database, to a person’s real name. But your DNA is, in a sense, the most identifying information you have.

In some cases, it may not matter whether your name is attached. Facebook reportedly spoke with a hospital about sharing anonymized data. They didn’t need names because they had enough information, and good enough algorithms, that they thought they could identify people based on everything else. (Facebook does not currently collect DNA information, thank goodness. There is a public DNA project that signs people up using a Facebook app, but they say they don’t feed the data to Facebook itself.)

Remember the 2013 study on finding people’s last names? The researchers took whole-genome data from several well-known people who had made their data public, and showed that DNA files are sometimes enough to track down a person’s full name. Perhaps DNA can never be completely anonymous.

Can You Protect Your Privacy While Using DNA Databases?

If you are very concerned about privacy, it is best not to use any of these databases. But you have no control over whether your relatives use them, and perhaps you are looking for a long-lost family member and therefore want to be in the database while minimizing the risks.

Here are some steps that can help maintain your privacy:

  • Don’t use your real name. This makes genealogy harder for police and legitimate users alike, so you may limit your chances of being found by a relative who would recognize your side of the family by name. But if that’s not a problem, you can register on these sites under an assumed name or just your initials. (Subject to each site’s terms of use, of course.)
  • Create a new email address if you want people to be able to contact you. Otherwise, your assumed name hides little.
  • Keep your data private, or look and then delete. This won’t help if you’re hoping someone will contact you, but if you just want to see who’s already there, a quick look can give you a snapshot. Upload your data (or sign up for the matching system, depending on the database) and see who shows up. Then get out of there and cover your tracks.
  • Download your raw data and delete your account. Sure, it’s handy that DNA services store your information, but you can also download the raw data as soon as it’s available. Then save the file and ask the company to permanently delete your account and (if possible) your actual saliva sample.
  • Think about which company you are using. Ancestry complied with 31 of 34 law enforcement requests in 2017. 23andMe says it fights every request and still hasn’t handed over anyone’s data. I’m not saying I trust 23andMe, but I know where I’d rather send my saliva.

Update 04/27/2018 3:20 PM: We originally wrote that the police “found the Golden State Killer.” In fact, they found a person suspected of being the Golden State Killer, which is not the same thing, and we regret the error.

