I am fortunate to have become a physician in a time of great scientific progress. Back when I was in college and medical school, the thought that we would one day be able to sequence the human genome (and now sequence hundreds of cancer genomes), to measure the expression of every gene in the genome simultaneously on a single “gene chip,” and to assess the relative abundance of every RNA transcript, coding and noncoding (such as microRNAs) simultaneously through next generation sequencing (NGS) techniques was considered, if not science fiction, so far off in the future as to be unlikely to impact medicine in my career. Yet here I am, mid-career, and all of these are a reality. The cost of rapidly sequencing a genome has plummeted. Basically, the first human genome cost nearly $3 billion to sequence, while recent developments in sequencing technology have brought that cost down to the point where the “$1,000 genome” is within sight, if not already here, as illustrated in the graph above published by the National Human Genome Research Institute. Whether the “$1,000 genome” is truly here or not, the price is down to a few thousand dollars. Compare that to the cost of, for instance, the OncoType DX 21-gene assay for estrogen receptor-positive breast cancer, which costs nearly $4,000 and is paid for by insurance because its results can spare many women from even more expensive chemotherapy.
So, ready or not, genomic medicine is here, whether we know enough or not to interpret the results in individual patients and use it to benefit them, so much so that President Obama announced a $215 million plan for research in genomic mapping and precision medicine known as the Precision Medicine Initiative. Meanwhile, the deeply flawed yet popular 21st Century Cures bill, which passed the House of Representatives, bets heavily on genomic research and precision medicine. As I mentioned when I discussed the bill, it’s not so much the genomic medicine funding that is the major flaw in the bill but rather its underlying assumption that encouraging the FDA to decrease the burden of evidence to approve new drugs and devices will magically lead to an explosion in “21st century cures,” the same old antiregulatory wine in a slightly new bottle. Be that as it may, one way or the other, the federal government is poised to spend lots of money on precision medicine.
Because I’m a cancer doctor and, if there’s one area in medicine in which precision medicine is being hyped the hardest, it’s oncology, it’s hard for me not to think that the sea change that is going on in medicine really hit the national consciousness four years ago. That was when Walter Isaacson’s biography of Steve Jobs revealed that, after his cancer had recurred as metastatic disease in 2010, Jobs had consulted with research teams at Stanford, Johns Hopkins, and the Broad Institute to have the genome of his cancer and normal tissue sequenced, making him one of the first twenty people in the world to have this information. At the time (2010-2011), each genome sequence cost $100,000, which Jobs could easily afford. Scientists and oncologists looked at this information and used it to choose various targeted therapies for Jobs throughout the remainder of his life, and Jobs met with all his doctors and researchers from the three institutions working on the DNA from his cancer at the Four Seasons Hotel in Palo Alto to discuss the genetic signatures found in Jobs’ cancer and how best to target them. Jobs’ case, as we now know, was a failure. However much Jobs’ team tried to stay one step ahead of his cancer, the cancer caught up and passed whatever they could do.
That’s not to say that there haven’t been successes. For instance, in 2012 I wrote about Dr. Lukas Wartman, at the time a recently-minted oncologist who had been diagnosed with acute lymphoblastic leukemia as a medical student and was successfully treated, but relapsed five years later. He underwent an apparently successful bone marrow transplant, but the cancer recurred again. At that point, there appeared to be little that could be done. However, Dr. Timothy Ley at the Genome Institute at Washington University decided to do something radical. He sequenced the genes of Wartman’s cancer cells and normal cells:
The researchers on the project put other work aside for weeks, running one of the university’s 26 sequencing machines and supercomputer around the clock. And they found a culprit — a normal gene that was in overdrive, churning out huge amounts of a protein that appeared to be spurring the cancer’s growth.
That was 2011 as well. Today, the sequencing could have been done much more rapidly. In any case, Ley identified a gene that was overactive and could be targeted by a new drug for kidney cancer. Wartman’s cancer went into remission. Wartman is now the assistant director of cancer genomics at Washington University.
The technology now, both in terms of sequencing and bioinformatics, has advanced enormously even since 2011. With it has advanced the hype. But how much is hype and how much is really hope? Let’s take a look. Also, don’t get me wrong. I do believe there is considerable promise in precision medicine. However, having personally begun my research career in the 1990s, when angiogenesis inhibitors were being touted as the cure to all cancer (and we know what happened there), I am also skeptical that the benefits can ever live up to the hype.
The origin of “precision” medicine
“Precision medicine” is now the preferred term for what used to be called “personalized medicine.” From my perspective, it is a more accurate description of what “personalized medicine” meant, given that many doctors objected to the term because they felt that every good doctor practices personalized medicine. Even so, “precision medicine” is no less a marketing term than was “personalized medicine.” If you don’t believe this, look at the hype on the White House website:
Today, most medical treatments have been designed for the “average patient.” In too many cases, this “one-size-fits-all” approach isn’t effective, as treatments can be very successful for some patients but not for others. Precision medicine is an emerging approach to promoting health and treating disease that takes into account individual differences in people’s genes, environments, and lifestyles, making it possible to design highly effective, targeted treatments for cancer and other diseases. In short, precision medicine gives clinicians new tools, knowledge, and therapies to select which treatments will work best for which patients.
If you think this sounds like what alternative medicine quacks (but I repeat myself) routinely say about “conventional medicine,” you’d be right. It’s not that precision medicine advocates don’t have a germ of a point, but they fail to put this criticism into historical context. Medicine has always been personalized or “precision.” It’s just that in the past the only tools we had to personalize our care were things like family history, comorbid conditions, patient preferences, and aspects of the patient’s history that might impact which treatment would be most appropriate. In other words, our tools to personalize care weren’t that “precise,” making our precision far less than we as physicians might have liked. Genomics and other new sciences offer the opportunity to change that, but at the cost that too much information will paralyze decision making. Still, at its best, precision medicine offers the opportunity to “personalize” medicine in a science-based manner, rather than the “make it up as you go along” and “pull it out of my nether regions” method of so many alternative medicine practitioners. It could also offer the clinical trials tools to do it, such as NCI-MATCH. At its worst, precision medicine is companies jumping the gun and selling genomic tests direct to the consumer without having an adequate scientific basis to know what they mean or what should be done with the results.
In any case, up until 2011, the term “personalized” medicine tended to be used to describe a form of medicine not yet in existence in which each patient’s unique genomic makeup would serve as the basis to guide therapies. Then, the National Academy of Sciences Committee issued a report, “Toward Precision Medicine: Building a Knowledge Network for Biomedical Research and a New Taxonomy of Disease”, which advocated the term “precision medicine” and differentiated it from “personalized medicine” thusly:
“Personalized medicine” refers to the tailoring of medical treatment to the individual characteristics of each patient. It does not literally mean the creation of drugs or medical devices that are unique to a patient, but rather the ability to classify individuals into subpopulations that differ in their susceptibility to a particular disease or their response to a specific treatment. Preventive or therapeutic interventions can then be concentrated on those who will benefit, sparing expense and side effects for those who will not. (PCAST 2008) This term is now widely used, including in advertisements for commercial products, and it is sometimes misinterpreted as implying that unique treatments can be designed for each individual. For this reason, the Committee thinks that the term “Precision Medicine” is preferable to “Personalized Medicine” to convey the meaning intended in this report.
As I said, “precision medicine” is a marketing term, but it’s actually a better marketing term than “personalized medicine” because it is closer to what is really going on. That’s why I actually prefer it to “personalized medicine,” even though I wish there were a better term. Whatever it is called, however, the overarching belief that precision medicine is the future of medicine has led to what has been called an “arms race” or “gold rush” among academic medical centers to develop precision medicine initiatives, complete with banks of NGS machines, new departments of bioinformatics and genomics, and, of course, big, fancy computers to analyze the many petabytes of data produced, so much data that it’s hard to have enough media upon which to store it and we don’t know what to do with it. Genomic sequencing is producing so much data that IBM’s Watson is being used to analyze cancer genetics. It’s not for nothing that precision medicine is being likened to biology’s “moon shot”, and not always in a flattering way.
So what is the real potential of precision medicine?
I discussed some of the criticism of precision medicine when I discussed the 21st Century Cures Act three weeks ago. I’ll try to build on that, but after a brief recap. Basically, I mentioned that I was of a mixed mind on the bill’s emphasis on precision medicine, bemoaning how now, at arguably the most exciting time in the history of biomedical research, the dearth of funding means that, although we’ve developed all these fantastically powerful tools to probe the deepest mysteries of the genome and use the information to design better treatments, scientists lack the money to do so. I even likened the situation to owning a brand new Maserati but there being no gasoline to be found to drive it, or maybe having the biggest, baddest car of all in the world of Mad Max but having to fight for precious gasoline to run it. I also noted that I thought precision medicine was overhyped (as I am noting again in this post), referencing skeptical takes on precision medicine in recent op-eds by Michael Joyner in The New York Times, Rita Rubin in JAMA declaring precision medicine to be more about politics, Cynthia Graber in The New Yorker, and Ronald Bayer and Sandro Galea in The New England Journal of Medicine. Basically, the number of conditions whose outcome can be greatly affected by targeting specific mutations is relatively small, and the impact is far smaller than it likely would be from duller, less “sexy” interventions, such as figuring out how to get people to lose weight, exercise more, and drink and smoke less. The question is whether focusing on the genetic underpinnings of disease will provide the “most bang for the buck,” given how difficult and expensive targeted drugs are to develop.
Over the weekend, there was a great article in The Boston Globe by Sharon Begley entitled “Precision medicine, linked to DNA, still too often misses”, that gives an idea of just how difficult reaching this new world of precision medicine will be. It’s the story of a man named John Moore, who lives in Apple Valley, UT. Moore has advanced melanoma and participated in a trial of precision medicine for melanoma. His outcome shows the promise and limitations of such approaches:
Back in January, when President Obama proposed a precision medicine initiative with a goal of “matching a cancer cure to our genetic code,” John Moore could have been its poster child. His main tumors were shrinking, and his cancer seemed to have stopped spreading because of a drug matched to the cancer’s DNA, just as Obama described.
This summer, however, after a year’s reprieve, Moore, 54, feels sick every day. The cancer — advanced melanoma like former president Jimmy Carter’s — has spread to his lungs, and he talks about “dying in a couple of months.”
The return and spread of Moore’s cancer in a form that seems impervious to treatment shows that precision medicine is more complicated than portrayed by politicians and even some top health officials. Contrary to its name, precision medicine is often inexact, which means that for some patients, it will offer false hope rather than a cure.
On the other hand, in the Intermountain study, after two years, progression-free survival in the group with advanced cancer treated using precision medicine techniques was nearly twice what it was in those who underwent standard chemotherapy, 23 months versus 12 months. Moore himself reports that with a pill he had one year of improved health and quality of life before his cancer started progressing again. It’s not yet clear in this trial whether this will translate into an improvement in overall survival, the gold standard endpoint, but it’s a very promising start. It is, however, not a miraculous start.
Here’s the problem. I’ve alluded to it before. Cancer genomes are messed up. Really messed up. And, as they progress, thanks to evolution they become even more messed up, and messed up in different ways, so that the tumor cells in one part of a tumor are messed up in a different way than the tumor cells in another part of the tumor, which are messed up in a different way than the metastases. It’s called tumor heterogeneity.
Now enter the problem in determining which mutations are significant (commonly called “driver” mutations) and which are secondary or “just along for the ride” (commonly called “passenger” mutations):
But setbacks like Moore’s show that genetic profiling of tumors is, at this point, no more a cure for every cancer than angiogenesis inhibitors, which cut off a tumor’s blood supply, or other much-hyped treatments have been.
A big reason is that cancer cells are genetically unstable as they accumulate mutations. As a result, a biopsy might turn up dozens of mutations, but it is not always clear which ones are along for the ride and which are driving the cancer. Only targeting the latter can stop a tumor’s growth or spread.
Knowing which mutation is the driver and which are passenger mutations is so complicated that the Intermountain researchers established a “molecular tumor board” to help.
Composed of six outside experts in cancer genomics, the board meets by conference call to examine the list of a patient’s tumor mutations and reach a consensus about which to target with drugs. Tumor profiling typically finds up to three driver mutations for which there are known drugs, and the board reviews data on how well these drugs have worked in other patients with similar tumors.
The next difficulty, Nadauld said, is that “the mutations may be different at different places in a tumor.” But oncologists are reluctant to perform multiple biopsies. The procedures can cause pain and complications such as infection, and there is no rigorous research indicating how many biopsies are necessary to snare every actionable mutation.
But a cancer-driving mutation that happens to lie in cells a mere millimeter away from those that were biopsied can be missed. Similarly, cancer cells’ propensity to amass mutations means that metastases, the far-flung descendants of the primary tumor, might be driven by different mutations and therefore need different drugs.
Or, as I like to say: Cancer is complicated. Really complicated. You just won’t believe how vastly, hugely, mind-bogglingly complicated it is. I mean, you may think it was tough to put a man on the moon, but that’s just peanuts to curing cancer, especially metastatic cancer. (Apologies to Douglas Adams.) Because of this, precision medicine as it exists now can lead to what Dr. Don S. Dizon calls a new kind of disappointment when genomic testing fails to identify any driver mutations for which targeted drugs exist because “discovery is an ongoing process and for many, we have not yet discovered the keys that drive all cancers, the therapies to address those mutations, and the tools to predict which treatment will afford the best response and outcome—an outcome our patients (and we) hope will mean a lifetime of living, despite cancer.”
None of this is to say that precision medicine can’t be highly effective in cancer. I’ve already described one patient for whom it was. It’s also important to consider that even an extra year of life taking a pill with few side effects is “not too shabby,” either, if the alternative is death a year sooner. Prolonging life with good quality is a favorable outcome, even if the patient can’t be saved in the end.
What is precision medicine, anyway?
As I thought about precision medicine during the writing of this post, one thing that stood out to me is that, although precision medicine is rather broadly defined, in the public eye (and, indeed, in the eyes of most physicians and scientists) its definition is much narrower. This narrower definition of precision medicine is the sequencing of patient genomes in order to find genetic changes that can be targeted for treatment, predict the response to therapy of various pharmaceuticals or dietary interventions, or predict disease susceptibility. In other words, it’s all genomics, genomics, genomics, much of it heavily concentrated in oncology. (I concentrated on oncology for this post because it is what I know best.) If you reread the definition from the National Academy of Sciences Committee report, you’ll see that precision medicine is defined much more broadly. Other similar definitions include metabolomics, environmental factors and susceptibilities, immunological factors, our microbiome, and many more, although even a recent editorial in Science Translational Medicine emphasized genomics more than other factors.
In fact, in the most recent JAMA Oncology, there are two articles, a study and a commentary, examining the effect of precision medicine in breast cancer. What is that “precision medicine”? It’s the OncoType DX assay, which is generically referred to as the 21 Gene Recurrence Score Assay.
Basically, this assay is used for estrogen receptor-positive (i.e., hormone-responsive) breast cancer that has not yet spread to the axillary lymph nodes. Twenty-one different genes related to proliferation, invasion, and other functions are measured, and an empirically derived formula is used to calculate a “recurrence score.” Scores below 18 indicate low risk of recurrence as metastatic disease and insensitivity to chemotherapy. Patients with low scores generally receive hormonal therapy but not chemotherapy. Scores over 30 indicate high risk and greater sensitivity to chemotherapy. For such patients, chemotherapy and hormonal therapy are recommended. Patients who score in the “gray” area from 18-30 remain a conundrum, but clinical trials are under way to better define the cutoff point for a chemo/no chemo recommendation. In any case, this study indicates that the use of OncoType DX is associated with decreased use of chemotherapy but because of limitations in the Surveillance, Epidemiology, and End Results (SEER) data set with linked Medicare claims, it wasn’t clear whether this decline was in appropriate patients. In any case, there’s no reason why genomic tests (like the Oncotype DX test) that are rapidly proliferating shouldn’t be considered “precision medicine,” and they are in practice already. Contrary to the image of oncologists wanting to push that poisonous chemotherapy, OncoType DX was designed with the intent of decreasing chemotherapy use in patients who will not benefit. Imagine that.
Conclusion: Medicine that works is just medicine
In the end, I don’t really like the term “precision medicine” that much. It seems to be a term that reminds me, more than anything, of Humpty Dumpty’s famously scornful boast, “When I use a word, it means just what I choose it to mean—neither more nor less.” It’s a sentiment that definitely seems to apply to the term “precision medicine.” To me, when new tests or factors that predict prognosis or response to therapy or suggest which therapies are likely to be most effective are developed and validated, it’s an artificial distinction to link them to genomics, proteomics, or whatever, as well as “big data” and refer to them as “precision medicine.” To me, medicine that works is just “medicine.”
803 replies on “Precision medicine: Hype over hope?”
“Precision medicine” is an unfortunate term for a variety of reasons, among which is that “precision” is not the same thing as “accuracy”. In a medical context, we can take “accuracy” to mean “treatment that addresses, if not cures, the underlying condition.” Some diseases have well-known cures that don’t need to be precise, and with others, like cancer, precision does not always help.
I like to illustrate the distinction between precision and accuracy by quoting Archbishop Ussher’s estimate for the creation of the Earth. He names a precise time (6 PM local time at the Garden of Eden; the often-quoted 9 AM, which contradicts “the evening and the morning were the first day”, is apocryphal) on a precise date about 6 ka ago, with the main uncertainty being due to not knowing exactly where the Garden of Eden was. That’s on the order of a part in 10^8, which is extraordinarily precise for Ussher’s day. But it’s not accurate; the actual age of the Earth is closer to 4.5 Ga.
I don’t know of any case in medicine where “precision” leads to a “precisely wrong” treatment in the way Ussher’s Bible studies led him to his precisely wrong answer about the age of the Earth, but the field is yet young, and there will likely be plenty of opportunities to make such a mistake.
Orac does stipulate that he is first discussing this in relation to his specialty, which is understandable. But for more general application– if this 1K figure is indeed realistic, we could sequence every child born in the US in one year for the cost of a few dozen F-35’s or similar tradeoffs.
Now that would be a study.
It might actually tell us something useful about autism and obesity and such, as well as more physically acute conditions, as we follow the cohort through life.
Gets my vote.
Ah, but medico-marketing can be so persuasive–if you practice precision medicine, then ain’t you a “precise doctor?” Yup, a sad state when marketing/propaganda trumps science.
It seems to me that would be an expensive way of generating vast amounts of data that no one would have any idea what to do with.
From Sam Kean’s ‘The Violinist’s Thumb’ (Venter and Collins were big players in the HGP):
I think we may have to wait until computer processing power has become even cheaper before that kind of venture would be worthwhile.
I thought this was also interesting (op cit):
It looks as if knowing a person’s genome isn’t quite as useful as one might think.
Krebiozen once again quotes out of context so he can reply with a non-sequitur.
One major problem with precision medicine is that it relies on the false idea that a complex disease requires a complex treatment. This is not true, since many tumors can be treated by surgery without extensive molecular knowledge.
The same principle can apply to chemotherapy, by using the fact that all cancers have a common feature, uncontrolled growth. So the future of cancer treatment, immunotherapy aside, will come from G2 checkpoint inhibitors and protection of normal cells by cell inflation, associated to chemotherapy targeting dividing cells.
He had a valid point regarding processing power. That much data would be difficult just to store let alone process. Why bother collecting all that data when we don’t even have the infrastructure to store it, let alone analyze it? From the conclusion of the study talked about in an article Orac linked to:
Let me calculate, by Fermi problem methods, how much information is involved here. For purposes of this post, cows are spherical, etc. (I am a physicist, after all).
A human has a few tens of thousands of genes. (Probably more than a zebrafish, which has about 20k, but probably less than 100k.) Let’s call it 30k, just to keep things in round numbers. Each gene codes for a protein that has between a few hundred and a few thousand amino acids–let’s take 1000 for an average figure. At three base pairs per amino acid, that’s around 100 million base pairs per genome, not counting junk DNA. There are four bases, so each base takes two bits, and we are discussing something like 30 MB of data for one person. The US has a population of a bit over 300 million, which implies about 5 million births per year. So we are looking at 100-200 TB of data for a single year’s birth cohort, or around 10 PB of data for the entire US population. That’s large but not excessive for a Big Data project these days (many physics research projects will produce hundreds of terabytes per year, and some discard large amounts of data to keep the total that low).
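For anyone who wants to check the arithmetic, here is a minimal sketch of the same Fermi estimate in Python. Every input is one of the round numbers from the comment above (30k genes, 1000 amino acids per protein, 300 million people, 5 million births a year), not a measured value, and the constant and function names are made up for illustration. It counts coding sequence only, stored at two bits per base with no quality scores or repeat coverage, so the unrounded results only need to match the comment to within the usual Fermi factor of a few.

```python
# Fermi-style estimate of raw storage for protein-coding sequence only.
# All inputs are the round numbers from the comment, not measured values.

GENES_PER_GENOME = 30_000          # "let's call it 30k"
AMINO_ACIDS_PER_PROTEIN = 1_000    # assumed average protein length
BASES_PER_AMINO_ACID = 3           # one codon per amino acid
BITS_PER_BASE = 2                  # four possible bases -> log2(4) = 2 bits
US_POPULATION = 300_000_000        # "a bit over 300 million"
BIRTHS_PER_YEAR = 5_000_000        # the comment's round figure


def bytes_per_person() -> float:
    """Bytes needed to store one person's coding sequence at 2 bits per base."""
    bases = GENES_PER_GENOME * AMINO_ACIDS_PER_PROTEIN * BASES_PER_AMINO_ACID
    return bases * BITS_PER_BASE / 8


per_person = bytes_per_person()
cohort_bytes = per_person * BIRTHS_PER_YEAR
population_bytes = per_person * US_POPULATION

print(f"per person:   {per_person / 1e6:.1f} MB")
print(f"birth cohort: {cohort_bytes / 1e12:.1f} TB")
print(f"population:   {population_bytes / 1e15:.2f} PB")
```

Changing any one constant (for example, the number of genes) shows quickly how sensitive the totals are to each guess.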
But I agree that it won’t do any good to collect the data if you don’t know what you are going to do with it. At least with a Big Data physics project, you have some well-defined science question, and to undertake it you need to convince a leading funding agency that your project is worth funding. Zebra’s proposed data collection effort sounds (if I may put on my reviewer’s hat for a moment) like a solution in search of a problem. I can’t answer for NIH, but I know that NSF and NASA do not like to fund fishing expeditions like that.
It wasn’t storage I was thinking of so much as the processing power to find associations between groups of mutations and specific conditions. I thought it was interesting that Venter and Watson (as in Crick) both had mutations that are associated with serious physical conditions that they did not have – presumably this is some epigenetic phenomenon that turns the relevant genes off (or on). When you add in the epigenetic data that would be required to figure out what is going on I don’t think current computers have the necessary power.
From that PLoS Biology paper:
It goes on to say that this is an issue not necessarily solved by Moore’s law:
Krebiozen’s point was entirely valid, not a “non-sequitur”.
Eric Lund #8,
Eric, this would just be a database, like census information, that could be used by individual research projects. The sooner we have the data, the sooner that can begin to happen. Is the census a “fishing expedition”?
The numbers are big because this is a big country; maybe we could pay Canada to do it?
Still having language problems? Look up “non-sequitur” and look up “valid”.
Eric Lund, I run a CLIA NGS lab, and you’re underestimating 🙂 We don’t just sequence each base once; we do it 30x, on average, for statistical power. Our latest exome data (which is what you’ve described) is ~15-20GB in its raw form, not to mention any files that are made for analysis purposes. Genomes are closer to 120 GB.
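A rough way to see where a number like 120 GB comes from is to count sequenced bases rather than genome positions. The sketch below assumes a human genome of about 3.2 billion bases (a commonly quoted round figure, not something stated in the comment), uses the 30x average depth and the 120 GB total from the comment, and reads "GB" as decimal gigabytes. It only backs out the implied storage per sequenced base; it says nothing about the lab's actual file formats or compression.

```python
# Back-of-the-envelope check on raw whole-genome data volume.
# GENOME_BASES is an assumed round figure; the other two come from the comment.

GENOME_BASES = 3.2e9       # assumed size of one human genome, in bases
COVERAGE = 30              # average sequencing depth ("30x")
RAW_GENOME_BYTES = 120e9   # "closer to 120 GB", read as decimal gigabytes


def sequenced_bases(genome_bases: float, coverage: float) -> float:
    """Total base calls produced when each position is read `coverage` times."""
    return genome_bases * coverage


total_calls = sequenced_bases(GENOME_BASES, COVERAGE)
bytes_per_call = RAW_GENOME_BYTES / total_calls

print(f"base calls per genome: {total_calls:.2e}")
print(f"implied bytes per base call: {bytes_per_call:.2f}")
```

On those assumptions the raw data works out to a bit more than one byte per base call, far above the quarter byte per base used in a bare two-bit encoding, which is the gap the comment is pointing at.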
As everyone’s been describing though, the big reason we don’t just “sequence babies” – or the whole population, as some have suggested – now really is because sequencing is the easy part. We can generate genomes, and even store them (though that’s a pain), until we’re blue in the face, but determining the needle in the haystack causative mutation in sick people is hard enough, much less predicting what may go wrong in healthy folks. We just haven’t done enough of the (much harder) genetics work to figure out how to appropriately interpret the data.
Daniel, how’s that search for evidence demonstrating cell inflation represents an effective treatment for cancer coming? Got anything that approximates proof of concept, let alone that it has the potential (as you claimed when you first appeared on RI a couple of years ago) to be a ‘universal’ treatment for cancer?
The numbers are big because this is a big country; maybe we could pay Canada to do it? Oh, yes, please! Our little hamlet will get right on that.
: a statement that is not connected in a logical or clear way to anything said before it
: fair or reasonable
[email protected] (responding)
Swapping in definitions:
Krebiozen’s point was entirely reasonable, it is a statement that is connected in a logical or clear way to what was said before it.
I believe you misused “non-sequitur” to simply dismiss a valid point without addressing it.
Still having language problems? Look up “non-sequitur” and look up “valid”.
It would help if you looked up the former, as your attachment to that erroneous hyphen is quite grating.
^ Grumble grumble blockquote grumble.
What you do not seem to realize is that CIAC represents a lot of investment with no financial return. For the evidence, to make a comparison, it is like saying that gene therapy is the way to treat genetic disease: you don’t really need a proof of concept. The only thing is enough money to make things work. For me it’s better to work on G2 abrogation because, with drugs, you can attract investors, but I think that CIAC is safer. And I am quite optimistic that it will be done in one country or another.
“Krebiozen’s point was entirely valid, not a “non-sequitur”.”
Is itself a non-sequitur.
“Krebiozen’s point is not a non-sequitur, [because it is] entirely valid.”
But “being valid” does not refute the claim of something being a “non-sequitur”.
Why don’t you just admit that you learned something again instead of wasting bandwidth? I already know how to suck eggs.
Are you really trying to correct someone over a word you can’t even spell correctly, or am I having a stroke?
While the computing power needed to analyze the data would be expensive, I’m not sure the algorithms needed for bulk comparison are there, and we likely don’t even know where to look, getting the data would be a first step when it’s cheap enough. We might not be able to use the full data set in any good way for 10 or 20 years, but then the children won’t have developed all of the conditions one might correlate to the genetic data, either.
Other issues of such a study would be providing the ongoing follow-up and maintaining adequate confidentiality. Each genome would need to be linked to the individual’s medical records for the patient’s lifetime in order to get the best data. This has the potential for abuse or inadvertent disclosure.
Another problem with big data is that the current academic reward system is based on experimental papers. If you just come up with a new interpretation of published data you will have a hard time getting it published.
Mephistopheles O’Brien #21
Basically correct on the first point. If we had the data tomorrow, it would be possible to begin preliminary sorting, which might well reduce computational cost later on as “sick” or otherwise characterized populations are identified over the decades.
Issues of confidentiality don’t seem that big a problem to me, unless it gets hacked and published on the internet… oh wait… . But realistically, no, assuming reasonable care and stiff penalties for misuse, I can’t see any real risk.
One thing I would be concerned about is that I/O speed might be the bottleneck (see my post #10). It seems like a waste of resources to store data we can’t use and that will likely need to be copied to new hardware just to be usable. The logistics involved in a project like that would be a nightmare. Heck with that amount of data software changes would be a nightmare as well. With the amount of resources required it’s really best to have the proper container prepared before trying to fill it. I think it would be a far more efficient use of resources to sort out the hardware and software requirements prior to mass collection of data.
I totally agree with you about confidentiality. The government, insurers, hospitals, etc haven’t been inspiring much faith in their ability to protect confidential data.
No, because there are specified uses for that data set. In order to properly apportion representatives, we need to know how many people there are and where they live. In the US, the Constitution provides for an “actual Enumeration” every ten years. (Details are different in other countries, but any halfway functional democracy needs similar data at regular intervals for this purpose–the alternative is “rotten boroughs” and “pocket boroughs” such as existed in the UK at various times in history.) Other data collected by the Census Bureau is routinely used for a number of studies: demographics, wealth distribution, and many others of this kind. It’s also a much smaller data set than individual genomes: these days, you could store all of that data on one commercially available hard drive. There are also laws (at least in the US) requiring that the data remain confidential for a period of time (72 years IIRC), after which they are made public–those census records are handy for people who do genealogical research since they can often be used to track where certain people move.
A population-wide genome database would presumably also be confidential–HIPAA either would or should cover it. But it involves quite a bit more storage. As others note, data analysis gets a lot trickier; e.g., you would have to have some way of tying it to medical records for it to be useful in any way (as MO’B notes above). Comparisons are a hard problem as well; if you are not careful about how you design the algorithm, you will get something that scales as N^2 where N is the number of people in the database (because with N people you have N[N-1]/2 pairs of people), or worse if you are doing multi-way comparisons.
To me, the confidentiality issues MO’B brings up are sufficient reason not to collect the data any sooner than it would be of practical use, because IME any sufficiently large database of confidential information eventually will be abused in some fashion. Think hackers stealing credit card info, or the NSA’s dragnet collection of telephone metadata, but on a much larger scale.
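For a rough sense of the N[N-1]/2 scaling mentioned above, here is a minimal sketch that only counts pairs. The cohort sizes are the ones floated in this thread (385,000 Canadian births, about 4 million US births) plus a small illustrative value; the function name `pair_count` is just a label for this example.

```python
def pair_count(n: int) -> int:
    """Number of distinct pairs among n genomes: n(n-1)/2."""
    return n * (n - 1) // 2


# Cohort sizes: 1,000 is illustrative; the other two come from the thread.
for n in (1_000, 385_000, 4_000_000):
    print(f"{n:>9,} genomes -> {pair_count(n):,} pairwise comparisons")
```

This says nothing about how expensive each comparison is; it only shows how fast the number of comparisons grows if every genome is compared with every other one.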
Since this does begin to intrude on my area of expertise – I can say that Zebra has no idea exactly how much data he’s talking about – both storing and processing.
The data alone would take years to process – even the initial collection and export into a database. There are also few, if any, current database technologies that allow for the storing and analysis of datasets even approaching the size in question & none that would allow for the connection of so many disparate individual items of comparison.
By the time the technology “might” be available to do it, it is, in all likelihood, probable that better diagnostic tests would already be available to solve many of the issues that would have made this data interesting in the first place.
The Census is also a relatively small amount of data – meaning that the variables are both known and quantifiable (there are only a set number of questions on the forms & check boxes for the individual person).
Full gene sequencing for hundreds of thousands, if not millions, involves an unknown number of variables – and no idea how any or all of the variables might be connected to each other.
Eric Lund,
Therein lies the issue I was trying to get at in #25. I was once involved in a project to upgrade server hardware and software as well as migrating our code base from VB 6 to VB.NET. It was a mess. There’s no reason to set ourselves up for that when there’s no current use for the data. It is so much easier to build the proper infrastructure (both hardware and software) the first time around than to have to upgrade later.
What “disparate individual items of comparison” are you talking about?
I get the sense that people are projecting some complicated scenario onto a simple suggestion. You record the genome– admittedly a long bit of information– along with, say, social security number.
What’s the problem?
This is kind of zebra’s MO. I wouldn’t be surprised if he comes back and claims that he knows better than you.
Shorter Daniel @18
“No, I don’t have any evidence because MONEY.”
Where else have I heard this? Oh, yes–every alt-med proponent arguing that there is no funding for studies proving what they know to be true and that vitamin C/baking soda/aromatherapy etc. cures cancer because they aren’t patentable.
@zebra – and exactly how big is a single genome? How do all of the genes relate to one another & in what context?
You’ve put forth the idea that this data could be collected simply and just stuck somewhere – but by what method would you do so?
Having the data is merely one thing, but that data must be processed, stored and retrieved in some fashion for that information to be valuable – I am merely pointing out that the storage systems (i.e. databases) don’t exist to be able to handle this quantity of information in any meaningful way to allow for searching or using the results…..not in any sense that would take less than years – by which time, the information is no longer valuable.
I just don’t understand what it is you are imagining.
Let’s begin like this: Can you tell me the maximum length a sequence would have to be in order for you (you imply you are an IT person) to be able to handle it?
The money argument is valid whoever uses it. And yours is obviously a fallacy:
“This is kind of zebra’s MO. I wouldn’t be surprised if he comes back and claims that he knows better than you.”
Given how easily I got him to fold, maybe I do.
Some of you may have missed malia’s comment at #13 which was held up in moderation. It’s worth reading, I think.
My point, which I thought was obvious, was that most people seem to think it’s simply a matter of getting lots of genomes and correlating it with physical illnesses, including autism and obesity, and figuring out which genes differing from the ‘standard’ version* are responsible. It is a great deal more complex than that, with some mutated genes being turned off or on by other genes and by other epigenetic factors we do not yet understand. I think there are better ways the NIH or whoever** could spend $5 billion (assuming $1,000 per genome and five million births per year).
* How do we establish what is the normal human genome? The current ‘standard’ HGP genome is an average of a number of different people’s genomes, but since we all have dozens of serious mutations this is a somewhat moot point.
** Oddly it was the US Energy Department that started on the HGP, the rationale being that they were investigating the effects of radiation on DNA.
Sorry about the grammar fails in my last comment – rushed editing.
It’s great to have a real expert pitching in.
“We just haven’t done enough of the (much harder) genetics work to figure out how to appropriately interpret the data.”
For those of us who are not experts (and don’t pretend to be), could you explain what the “genetics work” entails?
A quick question – we each have two copies of each chromosome (apart from the y chromosome in men, of course). Presumably both are sequenced – does anyone know how that works?
@ zebra – the reason the genetics work is hard is that there’s lots of ways to skin a cat, but I’ll try.
– A lot of it is basic genetics – we “break” a gene in a model organism and see what happens phenotypically. But this only works for genes that change a phenotype. Krebiozen’s post above, where he mentions epigenetics, talks about some of the reasons why it’s hard to correlate genotype and phenotype; but there are a myriad of others that we have to take into account – gene families where one gene can “rescue” another can hide a gene’s function, for instance, or genes that have such a subtle phenotype that we can’t pinpoint a change by looking.
– Some of it is looking at patterns in large cohorts of people – but again, a lot of that information can be masked by the same issues I mentioned above, and in many cases, when we start with “phenotype first”, there may be lots of different genes creating something that *looks* the same to us on the outside, which can confuse the issue.
These types of research aren’t sexy, nor do they use fancy machines, so they’re rather under-funded, to the frustration of every working geneticist, ever.
@Krebiozen – being diploid is one reason why we sequence at depth rather than just 1x. We physically chop all 46 chromosomes (23 pairs) into manageable bits, and sequence them, presumably equally; then to put them back together, we compare the sequence data to the “human genome” (called hg19, which is really a mix of about 6 people) to find canonical differences from that reference, and we look for any differences in our patient’s sequence – where we have 50% of one nucleotide, and 50% of the other nucleotide, we know we have a difference between the maternal and paternal chromosome. But, what we can’t easily do is “phase” these differences – so, we don’t know whether a set of mutations that are physically close to one another live together on a single chromosome, or whether they’re dispersed between a chromosome pair. (FYI – this problem with phasing is also a contributing factor in determining if a mutation profile is “disease causing” or benign in some cases)
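To illustrate the "50% of one nucleotide, 50% of the other" idea, here is a toy sketch that calls a single position from the bases observed in the reads covering it. The function name `call_site` and the 30% to 70% window are assumptions made up for this example; real variant callers use statistical models and base qualities rather than fixed cut-offs.

```python
from collections import Counter


def call_site(reads: str, ref: str, het_low: float = 0.3, het_high: float = 0.7) -> str:
    """Toy genotype call for one position.

    reads: the bases seen at this position, one character per read.
    ref:   the reference base at this position.
    The het_low/het_high thresholds are illustrative assumptions.
    """
    depth = len(reads)
    if depth == 0:
        return "no-call"
    # Count only bases that differ from the reference.
    alt_counts = {b: c for b, c in Counter(reads).items() if b != ref}
    if not alt_counts:
        return "hom-ref"
    alt, alt_n = max(alt_counts.items(), key=lambda kv: kv[1])
    frac = alt_n / depth
    if frac < het_low:
        return "hom-ref"          # a few odd reads are treated as noise
    if frac <= het_high:
        return f"het {ref}/{alt}"  # roughly half the reads differ
    return f"hom-alt {alt}/{alt}"


print(call_site("AAAAAGGGGG", "A"))  # 10x depth, half the reads differ
print(call_site("G", "A"))           # 1x depth: a single read cannot show both copies
```

With a single read (1x) the site can only ever look homozygous, which is the point about sequencing at depth. Nothing here addresses phasing: two heterozygous calls at neighbouring positions still do not say which chromosome copy each variant sits on.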
Your reply is greatly appreciated. But I remain puzzled as to why anyone would object to my suggestion, since I am offering what you appear to need. Let’s do this with Canada as a more manageable source of data:
Yearly births of 385,000 times 1,000 per genome is 385,000,000.
An F-35 (USAF, the cheaper model) is about 150,000,000.
So the US could buy 4 fewer of these (in a projected fleet of 450 plus) and easily cover data collection costs for our friendly neighbors to the north with their more rational health care system. Need more data, say from a larger country, lose a few more planes.
So this is what I don’t get. You say you are underfunded, but here I am giving you (and geneticists everywhere) free access to all the data you could possibly use. I understand that you need computing power to work with it, but since you aren’t paying for the sequencing, you have more funds to do the cat-skinning.
And the data will be useful even after every baby has grown old and died, assuming other records are maintained. The processing is only going to get easier and cheaper over time.
So really, what is the problem?
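For anyone who wants to check the back-of-the-envelope figures above, a minimal sketch follows. All three inputs are the numbers quoted in the comment (385,000 births, $1,000 per genome, roughly $150 million per F-35); it covers sequencing cost only and assumes nothing about storage, analysis, or upkeep.

```python
import math

births_per_year = 385_000        # Canadian births, as quoted above
cost_per_genome = 1_000          # dollars, the "$1,000 genome" assumption
f35_unit_cost = 150_000_000      # dollars, figure quoted above

sequencing_cost = births_per_year * cost_per_genome
planes_needed = math.ceil(sequencing_cost / f35_unit_cost)

print(f"Sequencing cost per birth cohort: ${sequencing_cost:,}")
print(f"F-35s needed to cover it: {planes_needed}")
print(f"Left over from 4 planes: ${4 * f35_unit_cost - sequencing_cost:,}")
```

On these numbers three planes would cover one year of sequencing, so four leaves a margin, but only for the sequencing line item.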
Thanks malia, that answers my question perfectly. It seems to me that phasing is going to be a serious issue in the future, unless/until someone finds an ingenious way of figuring it out.
And that the gene data is collected and labeled properly, and the other data is accurate, and that the other data can be perfectly correlated with the gene data, and that the other data is actually the data that is needed for the unforeseen analyses, and …
Think it through, don’t just go off half cocked, like you seem so fond of doing. Do like adults do: foresee problems and address them before they bite. This is your brain fartstorm; be responsible rather than defensive.
Performing single sperm sequencing in the same male individuals?
That, historically, has been a very generous assumption. Storage media age–yes, even hard drives, but this was even more of an issue with magnetic tapes, which were standard for decades. Interface technologies change. Software that was designed for a particular computer architecture may not be maintained as computers using that architecture age out of service. Et cetera. Dealing with these issues takes time and resources. Only recently have people in my (relatively data-heavy) field begun to devote the necessary time and resources to addressing this problem.
Show of hands here: how many of you who are over 30 can read every single computer file you have created over the last 20 years? I certainly can’t. At least two pieces of software I used extensively in the late 1990s (ClarisWorks and Canvas) no longer exist, but I still have files I created with those programs. In 1995 my home machine was still a Mac Plus (new in 1988) with a 400k floppy drive and a 20 MB external hard drive with SCSI connection–I still have the machine in a box in my basement somewhere, but I have no way of exporting data on those media to something my current machine can read, without paying major bucks to somebody who has maintained such a machine. I also have zip disks.
I work with people who have decades worth of data collected, some of it on nine-track tapes. Nine-track tape readers were once ubiquitous in this field; today the number of still-operating readers in the world can be counted on the fingers of your hands. And many of those tapes are too brittle to read. Even when data are on media we can read, we have to hope that documentation was kept of the data format (the data is generally in a binary format, because data storage was at a premium in those days). In some cases software to read the data exists, but is in some ancient version of Fortran that won’t necessarily compile on a modern computer, even if it had a Fortran compiler (they are no longer automatically included with many operating systems). A significant fraction of the data were never examined more than superficially.
Now scale this problem up to the size that Malia mentions for the human genome. And ponder the question of who is going to maintain such a database, and who is going to cover the costs of maintaining it–which are likely to be the same order of magnitude, per year, as it would cost to collect all of that data in the first place.
There is no shame in underestimating how much of a problem this is–lots of people do that. I have had occasion to recommend rejection of a proposal that I thought was making that mistake. But it’s better to understand the magnitude of the problem before we collect a bunch of data we will never be able to use.
Eric Lund #48,
This is very strange reasoning. We would “be able to use” the data immediately– my suggestion that it might still be usable 100 years from now is only to illustrate that this is a long-term project.
I also think your experience with magnetic tapes and consumer-type software is truly irrelevant– this would be a serious scientific endeavor backed by world governments and scientific institutions. (In the 21st century– no “stone knives and bearskins”; no slide rules and punch cards.)
So I still await an objection from someone (who one hopes would be an actual expert) who can explain why he or she would not like to have this resource available for research.
After having just spent hours transferring files on floppy disks using a borrowed external floppy disk drive (because they no longer come installed in computers) I can very much relate to what you are saying. It also reminded me of how painfully slow the buggers are. Now all I have left to do is find a way to get files off of a zip disk I have. Yay.
zebra, I’m sure everyone would like this as a resource. However, yours is a hollow victory unless you can will it to happen or pony up the money and resources to make it happen.
Good thinking, but I’m not sure that would help. Even if a single sperm contained enough DNA to sequence (it doesn’t*) it will carry chromosomes randomly selected, so you wouldn’t know if a gene was from a maternal or paternal chromosome. Also you would only sequence half the man’s DNA, and sequencing multiple sperm to get the full genome would lead to the same problem i.e. not knowing which genes came from which chromosome. Until we can isolate a single chromosome and extract enough DNA to sequence that, I see no way of overcoming this, yet.
* Genome sequencing requires 250 ng DNA. A single sperm contains only about 3 pg i.e. 0.003 ng of DNA. That’s nearly 5 orders of magnitude difference (a factor of about 83,000), which will doubtless take a few years to overcome.
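The size of that gap can be checked with a short sketch. The 250 ng and 3 pg figures are the ones given in the footnote above, not independently verified values.

```python
import math

required_ng = 250.0           # DNA input quoted above for genome sequencing
sperm_pg = 3.0                # DNA in a single sperm, as quoted above
sperm_ng = sperm_pg / 1000.0  # 1 ng = 1,000 pg

ratio = required_ng / sperm_ng
orders_of_magnitude = math.log10(ratio)

print(f"Shortfall: about {ratio:,.0f}-fold")
print(f"Orders of magnitude: {orders_of_magnitude:.2f}")
```

The ratio comes out between 10^4 and 10^5, so the shortfall is somewhere between four and five orders of magnitude.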
Not a troll #50,
“zebra, I’m sure everyone would like this as a resource.”
Ummm…. no. Apparently several people think it would be a Bad Idea. Including Eric.
“Bad idea?” No, but it lands very much in the impractical category……
Daniel Corcos #52,
Wow! That’s impressive. I wondered about PCR but dismissed it. It still doesn’t really solve the problem though, since we still don’t know which genes came from which copy of each chromosome. Or am I missing something?
It’s the $5 billion cost that makes it a bad idea. I’m all for collecting data just in case it comes in useful, but not when it costs lots of money that could be spent on something with immediate practical uses.
I would say that with enough sperm, you would be able to say that the genes come from the same chromosome and answer the question of whether several mutations are on the same chromosome or not.
No we wouldn’t. For a number of reasons that have been explained already.
Eric Lund’s comparison is apt. Recall the paper I linked to said I/O speed is likely to be a major bottleneck. Our current storage media are inadequate. To be able to make practical use of the data it would need to be moved onto faster media when it is available. This too has already been explained.
Krebiozen is absolutely correct in #56
I would also add that doing the data collection now would also unnecessarily incur additional future costs to upgrade the software and hardware infrastructure as better technology becomes available. As I said before, it would make much better use of resources to take those funds and put them towards creating the necessary big data technologies before collecting the data.
No reasons have been “explained” at all.
Are you saying that malia is some kind of psycho troll who is claiming to use genomes when in fact she isn’t? Or any of the other “working geneticists” she mentions? You must be really out of touch with the 21st century, just like Eric appears to be.
Could you point me toward the part of Malia’s comment @#43 that you thought showed an apparent need for sequencing every child born in the US for a year?
Because I don’t see one. On the contrary. She seems to me to be saying that the data they already have is still way too much of a research imperative all on its own for there to be any need or use for more.
By “you”, I am (obviously to me at least) referring to malia and all those working geneticists she invokes, and all future geneticists who might be able to do research because this data is freely available. As I very clearly pointed out, the costs saved on sequencing should allow for expanding the more substantive research activity.
[Yes, obvious to me, but of course, we can always distract from the topic by now discussing whether I should have said “you geneticists”, or “y’all”, or something else, and then go on to whether y’all is properly used as singular or plural, and so on. Or is there a hyphen in there somewhere?]
@zebra: There is a definite way that you can settle this dispute in your favor. To wit: write a proposal to the NIH or NSF (or equivalent body if you are outside the US) in which you will describe how you will collect the data, and what science question you will use the data to answer. You will need to convince the funding agency that you can do it within the constraints of the program to which you propose, and that your science question is of sufficient interest that the agency should fund your proposal rather than one of the competing proposals of comparable merit that they would otherwise fund. Then go out and achieve the proposed goal. If you can do this, you will have proved yourself right. The various people who are skeptical of your proposal, myself included, have given reasons why we think you won’t be able to achieve the objective within the allotted resources.
Right now what you are proposing is an “underpants gnomes” scheme: 1. Collect large-scale genome data. 2. ??? 3. Science! I know from experience, as do several others in the commentariat, that funding agencies aren’t going to fund proposals like that when there are already many more proposals with an explicit step two than they can afford to fund.
Just because you have hand waved them away doesn’t mean we didn’t explain. First there’s the technology issues I’ve brought up numerous times. Read the PLoS Biology paper I linked to for more detail but here are some of the issues:
Protecting confidential data:
Database size and search speed:
On the other hand there’s the issue that malia brought up in #13:
Throwing more data at the problem isn’t going to solve it. Just because you didn’t understand malia’s explanation in #43 doesn’t make it wrong.
And as she very clearly pointed out, since there is no present need to spend any time, money, or energy sequencing more data, that actually wouldn’t be saving costs. It would be a waste.
It’s basically like saying:
“I don’t have enough clothes. I need to buy some, which I can barely afford to do. But I know! I can save money by buying the clothes I’ll wear in thirty years now!”
Let me walk you through this.
(1) All those working geneticists she invokes have unfunded priorities right now.
(2) Having more data “freely available” would not help achieve them.
(3) It also wouldn’t be free. You’d have to spend money creating the database and making it available.
(4) That money is presently needed for other priorities.
(5) So spending it on something else now would detract from rather than aid presently ongoing research.
(6) Furthermore, there’s no way to even say whether the research being done now will result in findings that could be better translated to practical applications in the future if such a database were “freely” available.
(7) So the whole thing might easily be a great big waste of money, now and always. Because:
(8) Spending money supplying people with something for which there’s no demand just about always is.
Eric Lund #62,
Since I never proposed getting funding from any of those agencies, such a test would be irrelevant. I merely did a first approximation a la Enrico, see #44 for the less ambitious version.
If you would read carefully, you would see that I consistently have implied financing from the general fund, by invoking a metric– the much-maligned F-35– which is often used for this kind of analysis. If the right people got the contracts, I could imagine even this US Congress finding a way to fund the project. Probably by cutting food stamps and not airplanes, but you never know.
So the real issue is whether such a database would be useful.
“Probably by cutting food stamps and not airplanes, but you never know.”
“(4) That money is presently needed for other priorities.”
You heard it here first: Let them eat genome database entries.
“Sorting”? Sorting what into what?
You’re magically* going to get 4 million human genomes each with over 3 billion base pairs, and then…. do 8 trillion whole-genome comparisons? Why? Babbling about “computational cost,” a subject that you may reliably be assumed to know nothing whatever about (what word size for the sequences?) doesn’t cut it.
What sort of data structure do you imagine resulting from this exercise?
That’s because you’re a simpleton. Remember:
No, the security would be on the level of that for access to the VSD. Everything has to be deidentified, which is no small feat when one has a lifetime of medical records.
Speaking of which, how precisely do you figure that’s going to happen?
* You forgot about consent, now didn’t you?
And @#13, you have a person who’s in a position to know telling you that it wouldn’t, then explaining why @#43.
Where you underestimated the size of the U.S. birth cohort by an order of magnitude?
Listing out-of-context [out of context] quotes doesn’t constitute an “explanation”, and anyway, whether I call it a non sequitur or a strawman or a Gish Gallop, this has nothing to do with my suggested project.
And, saying “it’s hard” is not an explanation. Nor is “it’s not perfect”.
And in particular, when we are talking about basic research, “but it might not yield useful results” is not just a poor argument, but actually stupid.
If this database existed, people would use it. It’s absurd to suggest otherwise. People would choose genetics as a career exactly because of the opportunity. And almost certainly, it would spur the kind of innovation and development of tools that you are talking about. Kind of like DARPA’s little experiment with connecting computers in different locations, you know…
I really do think a lot of people here are simply “stuck” with their attachment to 20th and even 19th century paradigms. You can’t imagine a different way of doing things, or it makes you uncomfortable, or you feel threatened. Too bad.
Single cell sequencing is a thing; you can sequence a sperm (and we have), if one really wants to – and lots of animal science researchers want to. However, for humans, not super practical. First off, only useful in biosex males. Second off, those are germline cells, which are arguably different than the somatic cells (cells that make up the rest of your body) that you’d be interested in if you were doing diagnostic sequencing. We’re getting closer to being able to do phasing – there are some nifty biological tricks we can use with standard sequencing, and there is an adorable little sequencer, called the MinION, that’s in beta testing right now – our lab got 30 kb fragments off of it – but neither of these options is very cost or time effective….yet.
Would such a database be useful in the real world?? Honestly, the answer is that we don’t know, because we don’t have enough of the biological groundwork to make that decision yet. Might it be useful in the future? Perhaps.
IF (and only if) money and time and storage space and processing power were not a limitation – AND doctors took a thorough, objectively defined, descriptive medical history that was always coded properly in an EMR that followed each person through their life so we had good phenotype data on every condition on every human ever – sure, future human genetic researchers would theoretically love a resource that could interrogate the genomes of all humans everywhere.
However, @ann has a pretty good run-down of the ideas for why it’s not being prioritized. Avenues of research that are arguably more fruitful are currently underfunded, and there are lots of technical hurdles that have been mentioned up-thread that would need to be innovated and properly managed in order to make such an endeavor feasible. Beyond that, there are some HUGE ethical considerations that are still being hammered out. Lots of people are categorically NOT OK with having their genome stored somewhere. Lots of insurance companies are looking to make unsubstantiated claims for denying coverage based upon preliminary genomic data as “pre-existing conditions”. There are questions about whether adults should be making decisions about whether a baby’s genome should be sequenced, rather than that baby when they reach majority. Etc, etc.
You can call something out of context but it doesn’t make it so. That paper quite clearly explains the technological challenges involved in big data genomics.
Similarly you can say people would use the data and they certainly would want to, but both the technology and our understanding of genetics are not at the level where this data would be useful. You have yet to provide any counterpoint to any beyond “I don’t think so.”
Our CPU’s, storage media, and database technology are not fast enough. Our security is not good enough. And even if it was we don’t understand enough to make use of the data. I provided references about the technology and malia is an expert in the field who told you how our fundamental genetics knowledge is lacking. You on the other hand have offered nothing in defense of your idea.
Should be “…provide any counterpoint to any criticism beyond…”
@malia – a question almost, but not completely, off topic if you please.
Do you have any opinions on the commercial DNA testing companies that scan for genealogical ‘roots’? Any opinions on the validity of the results?
The word is that there is native American on both sides of my family tree, but as near as I can tell, it would be so far back, I’d probably be eligible for the Mayflower Society. I’d like to try to settle it one way or another.
It’s positively darling that Z. is now asshurt over this and doesn’t understand the error.
We’ve (tinw) been through this, but a careful prescriptivist would omit the apostrophe.
My comment about 20th century thinking wasn’t intended for you, but…”pre-existing conditions”* ?? I thought we were eliminating that little canard here in the USA.
But I get the same sense of Nirvana Fallacy/Grandiose Strawman from you that I do from others. Let’s say we collect the data as I described in 44, and sure there will be some opting out, and sure there will be errors– as happens now, but you keep working.
What I really don’t understand is why you think this only has utility for some great project involving all of humanity in the future. Assume the data is available; why will there not be young scientists (or old corporations) picking out sub-populations to study? I just don’t see the constraints described in any concrete form. What do I need more than a statistically significant sample, and just enough computing power to handle it?
*(Or “preexisting”, I’m sure Narad will be chastising you about that any time now.)
And thus comes the fantastic megalomania.
The current crop of genealogical tests are pretty good at giving you an idea of ancestry, so I don’t see any reason why folks shouldn’t try it if they’re curious. They don’t, however, have good demographics for some smaller outgroup populations (and some Native populations are included in this statement), either because of lack of genotyped members that can be used in the database, or because the group has an aversion to genetic testing (as is the case in some SW-USian Native populations).
I’ve personally never done one (because I’m cheap); but others in my family have, and the emotional responses to the results have, in all cases, been the most interesting parts. My step grandma called me in tears after hers, thinking she must have misread it because it only told her she was of German-ish descent, and she *already knew that*! And my Uncle’s test started a small Facebook-family-feud because there was too much Celtic genetics, and not enough French/Gallic genetics. My family may need a new hobby.
That’s a stylistic choice, not rank ignorance.
See my response to malia. But maybe you can try explaining this.
What exactly does “our understanding of genetics is not at the level where this data would be useful” refer to???
How can genetic data be “not useful” in the study of genetics???
Not to repeat myself, but…
Not when it’s a huge cost- and labor-intensive fishing expedition for something the potential uses of which nobody even knows how to recognize, identify, define or describe yet.
Because in that case, you’d just be stating one of the reasons that putting the cart before the horse is a bad idea by saying it.
It would actually be — and is! — stupid to pretend otherwise.
When people can’t use it to study genetics.
Because, as she already clearly stated @#43, they don’t yet know what to look for and/or how to look for it in the data they already have.
Research is not being held up due to a lack of sequenced sub-populations. As she also already clearly stated, sequencing is the easy part. The present imperative is interpreting the data they already have.
It’s actually the other way around. Supply does not create demand. It’s absurd to suggest otherwise.
What opportunity? People already — right now! — have more access to sequenced sub-populations than they know what to do with. Literally.
I don’t see how. Please elaborate.
Are you kidding? If there’s a single idea in existence that’s more 19th-century than definitively quantifying the genetic destiny of humankind in order to use it to build a better future through science, I don’t know what it is.
Storing and using huge amounts of data is a major technical challenge, one that both big tech companies (like Google) and government agencies (the NSA) are working very hard on right now.
I can only think of two uses for a database of just genomes (as opposed to genomes + medical data): for identifying bodies, and for identifying criminals.
Would sequencing everyone answer some scientific and medical questions? Probably yes. Are we capable of analyzing that data now? No.
To expand upon this further: Even if the data were available, there are resource costs (labor, if nothing else–people don’t work for free!) to searching the database. To pay for those costs, potential users would have to write a proposal demonstrating that (1) they have a well-posed research question, (2) the question is of sufficient importance to merit funding, and (3) the proposers have a methodology that they can show is likely to answer the question. At least that’s how the funding agencies I am familiar with work, and I have no reason to think NIH is different. Ann’s point is that researchers in the field don’t know what they are looking for, so they cannot write a proposal that satisfies the third point. It would be worse than looking for a needle in a haystack, because at least we know what a needle looks like.
I have been on review panels. I know what happens to proposals that don’t have a coherent methodology (or, as I described them above, “underpants gnomes” proposals). If you’re lucky, you’ll give the panel a good laugh for a couple of minutes or so before getting a rating that tells the program manager, “Don’t even think of funding this proposal, even if you have infinite resources.” For programs that practice proposal review triage, such as the NIH R01 program, you won’t even get that satisfaction. So nobody gets funding to look at the database, which just sits there consuming resources (as I pointed out above, there is a nontrivial cost to maintaining the data once you have collected it).
Maybe in a decade or two, once scientists in that field have figured out how to ask meaningful questions that can exploit such a database, we can talk about building that database. In the near- to mid-term, that money is better spent looking at data already in hand and trying to figure out what we can do with it. Only then will we be able to supply the second step in that business model.
Eric Lund @87
Not to mention that every action performed on a database has resource costs. In normal situations you don’t notice, which is probably why zebra apparently is unaware. However, for such a large database every single query will cost a significant amount of CPU cycles, disk reads, etc. Until there’s a need to analyze such large amounts of data it is far more efficient to work with smaller datasets.
From what malia said it seems like research has not progressed to the point where such large amounts of data are necessary or even usable. So, I would say that even if there were such a database, researchers wouldn’t be scrambling to use it. It doesn’t make sense to use a huge database with its inherent latency when a much smaller dataset would suffice and be much easier to manipulate.
Heh, redundancy. Important for databases, not so much comments.
To add on to capnkrunch and Eric Lund: it is entirely likely (even probable) that doing the kind of Big Data deep-database search we’re talking about here will require not only new hardware and new software, but new mathematics.
Looking at data sets as large as the genome of every child born this year (for a small subset) will need new algorithms at the very least. My mathematician friends tell me this is an exciting area of research, but while math is faster than pretty much any other science, it will still take time to develop these new methods.
Eric Lund #87,
Let’s review what malia actually says at #43:
That doesn’t sound like someone saying “we don’t know what we’re doing we just run around like headless chickens”. It does sound like someone contending with agonizingly slow processing and horrendously “noisy” data and eye-straining poor resolution and stuff like that.
I find it difficult to reconcile this with what you are saying– that “these people don’t even know what they are looking for”.
What I am suggesting is that if we pick up the tab for the sequencing– at the cost of a few airplanes we don’t really need– and build a resource like the Census or NOAA/NASA and other such entities– there will be competent people applying for grants to make the data more usable, at least, and others with a cogent proposal for basic research, and others with medical and other applications. You are free to disagree, but you have to offer more than generalities that fit with your experience, which simply may not be applicable here.
just a tech, capnkrunch:
“the kind of Big Data deep-database search we’re talking about here”
Since you are obviously not talking about the same thing I am, I guess we’re both right.
As I said to malia, if I am looking for a correlation for example, I only need a statistically significant sample to work with. Do you guys not understand that it is trivial to extract that from the greater database?
The issue, which is probably above your heads, is what constitutes a significant sample, because of the noisiness of the genomic information. But, we are far more likely to figure that kind of thing out if we get multiple people working on multiple problems in the context I describe to Eric.
Two issues here. First, why waste all the resources building this database when we only ever need a small subset?
Second, querying a database is not zero cost: there are costs for retrieving information from the logical data structure as well as for reading from the physical disk, both of which scale with size. With a database that large even generating an arbitrary subset is not trivial. If you have specific parameters in mind it only gets worse. It’s not as simple as cutting a slice of pie. Count computer science as another topic zebra has no grasp of.
Did you really say “why waste all the resources building this database when we only ever need a small subset” ??
I hope the rolling doesn’t pop my eyes out of their sockets.
OK, tomorrow I will look for at least marginally rational responses from malia and Eric.
OK, let’s back this train up a bit, because I feel like I am missing something.
zebra: Please tell me if I am understanding your proposal correctly. You want to search our genome database for a correlation. A correlation to what? Are you talking about searching a subset of the database (say, children with a very specific genetic disease) to find common gene patterns?
Or are you searching the whole dataset for a specific coding sequence?
And how can we know how big a statistically significant sample will be before we know what we are looking for? If it is something common we might need a relatively small sample, while if it is something very rare, we might need a very large sample.
Usually you decide the level of statistical significance you want, then from that calculate the power you need to find that, and that in turn will let you calculate the necessary sample size. Is that what you mean?
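As a rough illustration of that chain (significance level, then power, then sample size), here is a minimal sketch for comparing how often a variant turns up in cases versus controls, using the standard normal-approximation formula for a two-sided two-proportion z-test. The function name and the example frequencies are made up for illustration; a real association study needs a far more careful design.

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided two-proportion z-test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the chosen significance level
    z_beta = z.inv_cdf(power)           # quantile for the chosen power
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)

# Hypothetical frequencies: a common variant (30% vs 40%) and a rare one (0.1% vs 0.2%).
common = sample_size_two_proportions(0.30, 0.40)
rare = sample_size_two_proportions(0.001, 0.002)
print(common, rare)
```

The rare case needs a far larger sample than the common one, which is the point: the sample size cannot be fixed before you know what you are looking for.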
By doing what? What would they be proposing to do to the data, using what methodology to accomplish what?
Which is what, exactly? As in “I propose to establish the genetic basis for…” And how do you imagine they would be cogently proposing to establish it? By looking for correlations?
Between what and what else? How would they look? What would they be looking for?
And you know this because you see it in your Magic 8 Ball?
Perfectly sensible question.
ORLY? How? I mean, it would be more straightforward to say what database?, given that all you have is yet more sophomoric posturing, but just for fun, show everybody what the query would be here.
Seriously, go fυck yourself. Your dismal performances here have only demonstrated that you seem to know nothing whatever about any topic that you choose to spout off on. I can scarcely imagine what, if anything, you have postsecondary training in.
OK, let’s see what’s over whose head. Define your terms.
^ Don’t forget “statistically significant sample,” either.
Oh, it’s vastly worse than that – remember the “preliminary sorting” bit? He has no understanding whatever of data structures or algorithms. (Fun fact: multiple sequence alignment with sum-of-pairs scoring is NP-complete.)
Oh dear, hoofbeats fading into the distance. I’d say “not enough carrots” if only Z.’s psychological dynamics weren’t so painfully obvious.
Behold, Z.’s comment 91:
This, of course, is yet another repetition of his ciphering in comment 44:
Note that the issue has been pointed out indirectly by others, and I did it explicitly in comment 70. Add “order of magnitude”* to the pile of things that Z. doesn’t understand.
* And, ironically, cost analysis, given the context for this idée fixe of a “yardstick”; he in fact posits $4 billion as the total initial cost from inception to end of initial sequencing. Absolutely classic Z.
In the absence of any notion of what to do with it, a hundredth or a thousandth of the data would meet the same criterion.
All this big-data talk always brings to mind the Maxwell’s Demon of the Second Kind from Cyberiad.
Spoiler alert: “The demon prints out this information on a long paper tape, but before the pirate realizes most of the information is completely useless (although strictly factual) he is buried under the endless rolls of tape, ceasing to bother anyone.”
herr doktor bimler:
Don’t say that, zebra will roll his eyes right out of their sockets.
On that note, here’s an interesting case of zebra revealing his ignorance. In #93 I made two points. One was rather weak but required some knowledge of databases to rebut. zebra chose to ignore this one and instead opted to handwave away the much stronger, albeit simpler, argument (the same one herr doktor bimler made here).
zebra proving his lack of knowledge about a subject in one breath while insulting others in the next is nothing new. However, I think the way it came across in this case is somewhat novel.
The NP-hard problems are very likely going to remain NP-hard. New heuristics may be new mathematical applications, but I’m not so sure about the “new mathematics” part.
There’s a list of problems that were in need of attention a decade ago here (and many of the reference links are broken, rotted, or both), but I just don’t have the time at the moment.
Narad: Add “order of magnitude”* to the pile of things that Z. doesn’t understand.
I’m beginning to think he doesn’t understand anything at all. I mean, I don’t really understand computers, but I get the practical objections, not to mention the idea of obsolescence. I mean, does he seriously think computer tech is going to stand still for a hundred years? The computer I had ten years ago is vastly different from the one I have today, for example.
Eric and Justatech: I feel your pain. I’m trying to move files around, and even with two relatively close in age computers, it’s a gigantic pain. And, hey, I still have some floppies around. Don’t know what I’m going to do with them.
I have been a lurker here for years, yet never felt compelled to post before now. Sorry my first post is such a long rant. I realise that trying to dismount Zebra from his high horse is a futile task. But as a life-long geneticist/genomicist/molecular biologist or whatever it’s called these days, I just want to add my support to those who have already tried to explain how futile/wasteful Zebra’s idea is, at present. Better idea is to work on getting the ethics/consents and funding to collect high quality DNA (or at least blood) and immortalised cell lines from all births (or as many as consent), then when both the sequencing technologies plus data storage and analysis infrastructure are more mature, you can do all the sequencing AND make some sense from it. By then you might even have some decent phenotype data from this birth cohort which will help you select good subsets of the data to answer specific questions. I would also suggest that re-collecting samples from the cohort over time will yield even more useful data to track changes in epigenetic, RNA and protein expression over time. Ideally you would also want to track environmental and sociological data to work on gene-by-environment questions. Now we are talking about really REALLY big data, but potentially much more useful than just sequencing everyone born using current technology which is going to improve greatly over time. I am pretty worn out from having to re-impute genomic data from stuff done in the past. Imputation should only be done when you have no other choice.
I also see Zebra does not have any grasp of how big the computational and statistical challenges are. Think about 3 billion base pairs (that’s 6 billion bases) in an average human. Now think that each human has about 10 million of the base pairs as SNPs, plus CNVs, methylation differences and a whole host of other potential genomic differences between individuals. Let’s leave phasing out of it for the moment, though this is also a big problem. Now you need to screen all 3 billion base pairs to get the basic data that Zebra is talking about (keep in mind that is maybe up to 100 billion reads of DNA for decent filtering and mapping), then you need to do multivariate analysis of all the cross-wise comparisons between each base of “normal” DNA and affected individuals (whatever the effect is you are looking for), and do an enormous amount of correction for multiple comparisons to maybe get an idea of genetic difference that may be driving the effect. The more genes involved in a particular phenotype, the smaller the effect sizes and this is just the start of an answer to why it just doesn’t make any sense to do this right now. Do it when you have a specific question to answer, using the best technology you have available (and can afford) at the time to answer that question. And never, ever assume it is just about the plain text sequence of 3 billion base pairs that you can store in a database. That is a defunct idea spread by the hype of the HGP, which even people like me bought into at the time.
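To put a number on the multiple-comparisons point, here is a minimal sketch using the simplest correction, Bonferroni, applied to the roughly 10 million SNPs mentioned above. The 0.05 overall error rate is an assumption, and real studies often use other corrections; this only shows the scale of the problem.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

n_snps = 10_000_000   # "about 10 million" SNPs, from the comment above
alpha = 0.05          # assumed overall false-positive rate

per_test = bonferroni_threshold(alpha, n_snps)
# Expected false positives if every SNP were tested at 0.05 with no correction
# and none of them had any real effect.
expected_false_hits = alpha * n_snps

print(f"per-SNP threshold: {per_test:.1e}")
print(f"expected false positives without correction: {expected_false_hits:,.0f}")
```

A cutoff that strict is why small effect sizes spread across many genes are so hard to detect without very large, well-chosen samples.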
Or maybe to put it in terms that Zebra can relate to, would you buy 5 million F-35s just because they were $1000 each, when you may not need them at all, or not for 30 years? Maybe you will need 100,000 of them before then, would you still buy all 5 million, just in case? Where will you store them, who will maintain them, and will they be of any use in 30 years’ time? Or will the war technology have changed so much by then that they are basically of little use except at air shows… Now keep in mind that genomic technology is evolving much faster than war technology at the moment, and you can see there are a few problems with just stock-piling for the sake of it, even if someone stumped up the money for it (which is unlikely right now). Using the F-35 analogy, wouldn’t it be better to put that money into buying what you need right now to fight your current wars, and investing some of it into research for future war technology improvements (if war was your business)? That is all genomic researchers are saying. There are better uses for the money and technology right now, in both diagnostics (fighting the current wars) and research (improving technology for the future).
Personally, I like to dream that P=NP but realistically you’re probably right. That said, the work being done in compressive genomics is pretty interesting. I don’t know enough to differentiate “new mathematics” from “new application”. I would guess that new compression algorithms are more on the side of new application though.
Personally, I think the most interesting stuff is the work being done on homomorphic encryption. That’s some seriously cool cryptography. Again though, not sure it constitutes “new mathematics”.
The SDSS pipeline, as of 2001, represented 25% of the shebang. The raw data aren’t really comparable (as the PLOS item notes;
datanoise has a dedicated garbage channel early in the chain), but, I mean come on, already.
There’s something simple here, and it ain’t everybody else.
I’m confident that unanimity will be found as to your doing it more often.
I think the fallacy of Zebra resides in the airplanes’ argument. Very few of us need these airplanes actually, and, personally, if I had the money of these airplanes, this would change my life; much more than if I had all these sequences, which I cannot say if they will be useful one day. Maybe we can use this blog to say that the money of the airplanes should be given directly to us. We certainly know better what to do with it (except maybe for Zebra), but I fear that some very powerful people really need these airplanes.
Retro Pump #107,
An excellent (and obviously knowledgeable) critique of the proposal. Before I respond, I’ll just ask– why didn’t you jump in right at the beginning and say: “Zebra, I see where you’re going, but here’s how it would have to work.”
The whole point of putting the idea out there is to promote this kind of discussion. Why lurk when you have real information to contribute?
Anyway, as to the substance. I don’t recall that I ever suggested that we would get all the samples and then have a crash program to sequence them all in one year. Even I am not that crazy, and to suggest that is what I call a grandiose strawman. The point, which you appear to understand and have no problem with, is to establish the cohort.
So, what are the steps or components?
Getting the samples is trivial. Will some parents opt out? Stipulated. Will that skew the sample? Highly unlikely, except perhaps where you have a somewhat inbred religious or social group, and we would be aware of it.
Developing a registry of phenotypes? Well that’s why I suggested a country with a rational health care system and cooperative citizens like Canada. Yep, I will also stipulate that in the USA there would be all kinds of irrational paranoia about such a project. Bad idea to do it here, numbers aside.
Why do it sooner rather than later? OK, that’s where I think we disagree.
As I’ve suggested, the first reason would be to stimulate research and the advancement of tools and techniques and lowering costs. It’s a big contract.
But the real benefit to me would be getting as much done before people in the cohort start manifesting whatever we are interested in. Are you saying that it wouldn’t help to have the genomes to hand (whatever subset of the data has been processed so far) in order to inform further action and study? This I don’t get.
If the data is there, and someone says “I think there may be a (genetic) correlation with x”, this can be examined initially without any patient interaction at all– that is, by people outside the clinical setting where confounding factors abound.
As to the analogy, no, it doesn’t work. How about “why subsidize rooftop solar when everyone knows how to make and distribute electricity from fossil fuels, and solar technology will be better in 20 years anyway.” (Or any number of examples like pc’s and internet and electric vehicles and so on.)
You don’t get the progress by sitting on your hands; solar panels et al are cheaper and better exactly because we started buying them even before they were “perfect”. Which is why I mentioned the Nirvana Fallacy along with grandiose strawmen.
The strategy in the UK, funded by the NHS (Government), is to target specific groups rather than just the population.
“The 100,000 genomes project.”
The project will sequence 100,000 genomes from around 70,000 people. Participants are NHS patients with a rare disease, plus their families, and patients with cancer.
I understood that to mean “a year’s worth of births.”
You say that as if it were a bad thing. The Dunning-Kruger is strong in this one.
As has been mentioned overnight, another area in which major advances would be needed before a large-scale genome database could be useful is algorithms for sorting and correlating the data. For instance, if the best available algorithm today has an O(N^2) run time, but an algorithm with an O(N log N) run time were possible at least in principle, it would be well worth spending money to develop the latter. For N of the order of the annual US birth cohort, the difference between N^2 and N log N is five orders of magnitude, or roughly the difference between having a query take one second (as our equine-pseudonym commenter seems to think) and one day (because 2^20 = 1048576). In a genome database, a more realistic N would be in the billions if not trillions–what would take the N log N algorithm one second would take the N^2 algorithm anywhere from a few years to a few thousand years.
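The arithmetic behind those estimates can be checked with a short sketch. It ignores constant factors entirely, and the 4,000,000 figure for an annual birth cohort is an assumed round number, not something taken from the thread.

```python
from math import log2

def slowdown(n):
    """How many times more steps N^2 needs than N*log2(N), ignoring constant factors."""
    return (n * n) / (n * log2(n))

SECONDS_PER_YEAR = 365.25 * 24 * 3600

# 2**20 matches the power of two quoted above; 4,000,000 is an assumed cohort size.
for n in (2**20, 4_000_000, 10**9, 10**12):
    ratio = slowdown(n)
    # If the N log N algorithm takes one second, the N^2 one takes `ratio` seconds.
    print(f"N = {n:>17,}: {ratio:.2e} s = {ratio / SECONDS_PER_YEAR:.3g} years")
```

At cohort scale the gap comes out around five orders of magnitude (one second versus a day or two), and at a billion or a trillion items it grows to roughly a year and several centuries respectively.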
We can’t count on Moore’s Law to hold forever. There are physical limits: signals cannot travel faster than the speed of light (about one foot per nanosecond), and a logic gate cannot be smaller than a few atoms in size. Maybe we can do better with quantum computers than with conventional computers, but that, too, is a long development road–the current state of the art in actual quantum computing hardware is a benchtop experiment that is barely able to tell us that 15 = 3 * 5. (I’m not being quite fair here: that computer was intended to demonstrate a factoring algorithm which is much faster for large numbers–polynomial in log N instead of the O(N^1/2) of naive trial division, where N is the number to be factored. Which is a big deal because encryption algorithms depend on the fact that factoring large numbers with non-quantum computers is computationally hard.) And of course, even if quantum computers are developed to a level where they might help, the data migration problem strikes again, in spades.
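For comparison, the brute-force classical approach behind the O(N^1/2) figure is trial division. A minimal sketch follows; this is the naive method, not the best known classical factoring algorithm, and it has nothing to do with how the quantum algorithm works.

```python
def trial_division(n):
    """Factor n by trying divisors up to sqrt(n): about N^(1/2) divisions in the worst case."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:   # pull out every copy of this divisor
            factors.append(d)
            n //= d
        d += 1
    if n > 1:               # whatever remains is a prime factor
        factors.append(n)
    return factors

print(trial_division(15))   # [3, 5]
```

The loop count grows with the square root of the number itself, so every extra pair of digits in N multiplies the work by about ten; that is why this approach is hopeless for the numbers used in encryption.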
Those are all products that were developed after the technology and understanding that made them possible, not before.
The Nirvana Fallacy applies to things that are non-ideal, not things that are useless and unnecessary.
It just dawned on me that zebra’s thinking is essentially that if you put a bunch of stuff that uses the same code together and connect it up, it will be like the internet. (Hence the 21st-century snark.)
Maybe everybody else already grasped that, though.
Hmmm. This doesn’t quite fit with what you were saying before.
And once again, I’ll bring up the confidentiality issue that you previously handwaved away:
Just because you think something doesn’t make it so. Are you beginning to see a theme here? Data breaches aren’t even the biggest issue, it’s about properly deidentifying the information. There’s a reason genomics is driving some real cutting edge crypto*. Since I apparently can’t quote without it being out of context here’s some required reading so you don’t embarrass yourself again: Routes for breaching and protecting genetic privacy.
*It’s probably over your head but for extra credit go read the paper on homomorphic encryption I cited in #103.
That’s a dead link in your last comment, please repost.
There are already some interesting privacy issues emerging from sequencing people’s genomes. James Watson wanted his apolipoprotein E status (associated with Alzheimer’s) withheld when his genome was published, but some geneticists pointed out that it was easy to figure out from the surrounding genes that had not been redacted. These and similar issues will doubtless become more prominent in the future.
No idea what you are getting at with your first two quotes. You need to elaborate.
But, since you appear to be part of the USA paranoid group I mentioned, let me ask something about “genetic privacy”.
I have a list that matches social security number with a genome. I have a list that correlates some medical condition with social security number. The social security number is encrypted, if you like.
Now, if we go back on the momentum to eliminate the concept of pre-existing conditions as a way to deny insurance coverage or payment, I could imagine someone hacking the medical condition registry and selling the info to some nefarious insurance executive. I don’t think this will happen, but arguendo.
But could you please explain what possible deranged reasoning makes you think someone would hack the genome data, which you keep freakin telling me is impossibly difficult to turn into useful information in any useful time span??
My eye rolling is nothing compared to my headshaking at this point. [head-shaking]
Me, I think it would be neat to wear a t-shirt printed with my genome in tiny little letters– someday I expect one will be able to order such a thing from Amazon. What’s the problem?
Oops. Here’s the link: http://www.nature.com/nrg/journal/v15/n6/full/nrg3723.html
If zebra were sensible then your inference would be reasonable. However, since zebra was responding to a point made by literally nobody, whilst ironically complaining that it was a “grandiose strawman”, I’m not convinced that your interpretation of his intentions is correct.
It was in response to Retro Pump pointing out (apparently obvious only to the two of us) that the genetic information is stored in the samples. He says basically “keep it stored that way until we reach some future capability” and I say no, let’s get to work right away turning it into accessible data. For the reasons covered in 112.*
I never suggested, as some seem to keep implying, that we were going to have the entire database digitized immediately, and that we would then do analysis on the entire thing. Hence “grandiose”.
*Note also that this is why all the nonsense about deteriorating media and changes in software is nonsense. The samples are there with the information to be extracted in the future, with future better tech, if the data gets corrupted.
So, I have to apologize, I was wrong about the mechanics of data storage.
According to my database systems engineer (I’ll call him DSE), you can store 1 petabyte (PB) of data on one server rack of hard drives (a server rack being the size of an overgrown filing cabinet). If you don’t care about your I/O, you can do that for about $100K. If you want a searchably useful I/O rate it will be more like a million dollars. (The DSE considered this chump change but then I reminded him this was science and not Silicon Valley.)
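A back-of-the-envelope sketch of what that means for a birth cohort, under loudly stated assumptions: only the finished sequence is stored, packed at 2 bits per base, with no raw reads, quality scores, or medical records, and the cohort size is an assumed round number.

```python
BASES_PER_GENOME = 3_000_000_000   # "3 billion base pairs", as cited earlier in the thread
BITS_PER_BASE = 2                  # assumption: A/C/G/T packed into 2 bits each
COHORT_SIZE = 4_000_000            # assumption: rough size of one year's births
PETABYTE = 10**15                  # bytes; one rack holds about this much, per the DSE

bytes_per_genome = BASES_PER_GENOME * BITS_PER_BASE // 8
total_bytes = bytes_per_genome * COHORT_SIZE
racks = total_bytes / PETABYTE

print(f"{bytes_per_genome / 10**6:.0f} MB per genome")
print(f"{racks:.1f} PB for the cohort, i.e. about that many racks")
```

Keeping the raw reads and everything else that comes off the sequencer would multiply this many times over, so treat the result as a floor, not an estimate.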
The much, much bigger problem in the opinion of my DSE is the issue of data entry. The USA does not currently have a unified EMR or EHR system (electronic medical record or electronic health record). While a universal EHR system has been on the WHO’s to-do list since at least 2012, not a lot of progress has been made on that front.
So it doesn’t matter that we can store the data and search the data, we can’t get the data *in* in the first place.
So zebra, I apologize, I was wrong to say that the storage problems would make your plan unfeasible. It’s the data entry that makes it unfeasible. Maybe a country like Canada could do it (if they had the money) but I still believe that there are issues beyond funding.
(A side note about starting with infants: for late-onset diseases like obesity, type II diabetes, CAD and dementia we would have to wait an awfully long time for diseases to manifest before we can determine anything about the genetic basis.)
That was my bad for botching the link. Do some reading. At the very least take a look at Figure 1. The issue isn’t exfiltrating the entire database, it’s that genomics can potentially be used to identify otherwise deidentified protected information. As to why anyone would want to do that, it doesn’t particularly matter because HIPAA says you need to protect that information. As I said, there’s a reason why there is serious cryptography research going on to allow queries without exposing the raw data.
As to the social security thing, I was under the impression you were suggesting that we only collect genome data because Lawrence said the database would get too complex if it included all the medical data. Apparently I misunderstood but your actual idea still doesn’t make sense. Using SSN’s to index the database unnecessarily exposes additional identifying information. Not to mention, it was the genome itself that Lawrence was referring to as being composed of huge amounts of variables (go reread Retro Pump’s #107; try reading for comprehension this time).
Remember, we, the wealthy USA, and I would hope the rest of the world, are paying the [Canadians] to do this. I would throw in even a couple more F-35’s if need be. And someone mentioned that the Brit NHS was doing something along these lines with phenotypically distinct populations. Rational healthcare systems do have advantages.
But your point about late-onset conditions is quite correct. Note that my initial comment referred to obesity and autism, the former unfortunately affecting more and more children. And that follows from Orac suggesting, IIRC, that genomics might be better applied to those kinds of less-acute-public-health-chronic issues.
So yes, maybe we will have figured out the genetic components of diseases of old and middle age before the cohort gets there. But I am also arguing that this project will help accomplish that indirectly by stimulating R&D.
I am really trying to comprehend what you are saying but it ain’t working.
There’s a government database of genomes indexed by ss#.
There’s a government registry of conditions or diagnoses indexed by ss#.
If I want to do research on condition x, the government sends me a list of the genomes indexed by a completely arbitrary set of numbers which the government indexes to ss#, so I can follow up if some need arises, but they could even leave that out and I could still do my research.
Where do I go to use the genomic information to find out what about any individual?
Again, there’s no reason to use SSN’s as the index. This system you’re proposing requires that every request include an additional column mapping the new arbitrary index to the SSN. There’s no need to use the SSN. You just set up a hash table for the index and you can then reuse the hashes as the index for a queried subset, and there’s no need to map the subset’s index to the master database’s.
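A minimal sketch of that idea, with the caveat that it swaps in random opaque IDs for hashes of anything personal: the Python dicts below are hash tables keyed by a random token, and every name in it (`genomes`, `conditions`, `enroll`, `subset_for`) is hypothetical. A real registry would be a proper database with access controls.

```python
import secrets

# Hypothetical in-memory stand-ins for the two registries.
genomes = {}      # opaque_id -> genome data
conditions = {}   # opaque_id -> set of diagnoses

def enroll(genome, diagnoses):
    """Store one person's records under a random ID that carries no identifying information."""
    opaque_id = secrets.token_hex(16)
    genomes[opaque_id] = genome
    conditions[opaque_id] = set(diagnoses)
    return opaque_id

def subset_for(condition):
    """Return genomes for everyone with the condition, keyed by the same opaque IDs."""
    return {oid: genomes[oid] for oid, dx in conditions.items() if condition in dx}

a = enroll("ACGT...", ["condition x"])   # placeholder sequence strings
b = enroll("TTGA...", ["condition y"])
subset = subset_for("condition x")
print(list(subset) == [a])
```

The subset reuses the master index directly, so no SSN column and no second mapping table is needed. Note that this only removes the SSN from the picture; it does nothing about re-identification from the genome itself.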
Again, I’ll refer you to the paper I linked to in #123. There are attacks that can be used to re-identify deidentified genomes or that can link a known person’s genome to their PHI. I’m not going to explain it all for you but here’s a very simplified explanation of one type of attack. Markers from a deidentified genome are referenced against information in publicly available genealogical databases. These databases contain personally identifiable information. Because you were requesting genomes from people with a certain condition, you have correlated PHI with a specific person. HIPAA demands that not be possible.
This is not entirely theoretical either:
And researchers actually successfully used this method: Identifying Personal Genomes by Surname Inference (pdf).
Correction for #130
“The system you’re proposing requires that every request include an additional column mapping the new arbitrary index to the SSN.”
I thought about it and there could also be just one additional column mapping the arbitrary index to the SSN. But then there’s no need for the SSN because this is the same thing I mentioned in #130 except it unnecessarily includes SSNs as well.
Zebra:what possible deranged reasoning makes you think someone would hack the genome data, which you keep freakin telling me is impossibly difficult to turn into useful information in any useful time span??
First of all, they are assuming this can be turned into useable data, which in turn makes it vulnerable to hackers. As to why hackers would try to break it..uh, because it’s there? Because it could be used to embarrass and discredit people? I could imagine a lot of ways a break-in could get nasty fast.
Why would you even want your genetic data hanging out for all to see? What if you had schizophrenia or bipolar depression or a personality disorder running in the family?
I would certainly hope that there are no such government registries or databases, because as [email protected] notes, it would be illegal to maintain such databases in the US. There is a law (HIPAA) which requires medical information used for any form of research to be stripped of any and all identifying information. Social security numbers constitute identifying information, because they are unique numbers associated with specific people. For instance, anybody who knows your SSN can obtain credit in your name. (This is one of the ways that identity theft works.) If your medical history were also available to somebody who knows your SSN, he could use it to, e.g., legally obtain certain drugs that are not available over the counter, and sell them at a profit. Which leads to a nightmare scenario in which you can’t get the medication you need, because a program intended to sniff out prescription fraud has noticed that somebody using your identity is obtaining said medication in quantities well beyond what is needed for personal use.
Any database with financial information linked to individuals is a potential target for hackers. You need only pay attention to the news to see this: such databases are hacked on a regular basis. A database with everybody’s social security numbers and genomes/medical history in it might as well have a big sign on it saying “HACK ME” in letters big enough that, were it physically located in Washington, you could read it from Perth without reading glasses. No criminal penalty could deter all hackers, because the reward is so great.
There would have to be *some* unique identifier connecting the individual with his/her genome; otherwise there’d be no way to connect the individual’s medical records with his/her genome. Without that, all you have is a massive pile of bytes from which you could not learn anything about health.
But this whole project would require a database connecting every individual in the cohort with his/her medical records and that *would* be a target for hackers even if the genome info were not useful. Imagine the blackmail possibilities for a hacker who knew about treatment for venereal disease, drug addiction, schizophrenia or bipolar depression or a personality disorder in the family as PGP suggested.
capnkrunch et al:
The hospital knows the ss# of the baby.
The hospital sends the ss# and the sample to the government.
The government attaches an arbitrary number (xss) to the sample and to the digital form of the data. But it must keep a table of ss# and xss#.
The doctor knows your ss#.
The doctor sends the information ss# and diagnosis.
The government converts the ss# to xss using the table.
It stores the diagnosis indexed by xss.
So, as the person doing the research, when the government sends me a list of genomes associated with the condition I am studying, I have no contact with ss#, and as I said, the government doesn’t even have to tell me the xss.
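The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not anyone's actual system: the `Registry` class, its method names, the fake SSN strings, and the use of a random token for xss are all hypothetical.

```python
import secrets


class Registry:
    """Hypothetical sketch of the ss# -> xss scheme described above."""

    def __init__(self):
        self._ssn_to_xss = {}  # the table the government must keep
        self._genomes = {}     # xss -> genome data
        self._diagnoses = {}   # xss -> set of diagnoses

    def register_sample(self, ssn, genome):
        # Hospital sends ss# and sample; an arbitrary xss is attached.
        xss = secrets.token_hex(16)
        self._ssn_to_xss[ssn] = xss
        self._genomes[xss] = genome

    def record_diagnosis(self, ssn, diagnosis):
        # Doctor sends ss# and diagnosis; it is stored under xss only.
        xss = self._ssn_to_xss[ssn]
        self._diagnoses.setdefault(xss, set()).add(diagnosis)

    def genomes_for(self, condition):
        # Researcher receives genomes only, with neither ss# nor xss.
        return [
            self._genomes[xss]
            for xss, found in self._diagnoses.items()
            if condition in found
        ]


registry = Registry()
registry.register_sample("000-00-0001", "GATTACA")   # fake SSN, fake genome
registry.register_sample("000-00-0002", "CCGGTTAA")
registry.record_diagnosis("000-00-0001", "condition x")
matches = registry.genomes_for("condition x")
```

In this sketch the `_ssn_to_xss` table is the one place where identity and data meet, and the researcher still receives the genomes themselves.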
Somehow, you think this is less secure than the traditional method you are defending, where the subjects are directly involved with the researchers? Where “everybody knows your name”, and you can be secretly photographed or fingerprinted, and your credit card information and other easily readable health data is in the office, and your facebook page discusses your condition, and you order drugs online….
Nope. You still don’t understand what a hash table is. It stores the hashes without any need to keep the raw data (i.e. SSN). In any case, using the SSN is unnecessary. It makes more sense to have a unique identifier known to healthcare providers (think MRN) that is standardized. That way it minimizes consequences in case there is leakage.
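A minimal sketch of indexing by a hash instead of the raw identifier might look like the following. One swap is worth naming plainly: the comment only says "hash", while this sketch uses a keyed hash (HMAC), because a bare hash of something as short as an SSN could be reversed by trying every possible value. The `pseudonym` helper, the key, and the MRN-style identifiers are assumptions for illustration.

```python
import hashlib
import hmac


def pseudonym(identifier: str, secret_key: bytes) -> str:
    """Derive a stable index from an identifier without storing it.

    HMAC-SHA256 is an assumption of this sketch, not something the
    thread specifies.
    """
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key; in this sketch only the registry would hold it.
key = b"hypothetical-secret-held-by-the-registry"

# The index stores records under the pseudonym only.
index = {}
index[pseudonym("MRN-0001", key)] = {"diagnosis": "condition x"}

# A later submission with the same identifier lands on the same record.
lookup = index.get(pseudonym("MRN-0001", key))
```

The same pseudonyms can be reused as the index of any queried subset, so no column mapping back to the raw identifier has to travel with the data.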
The traditional methods don’t work here either. As I told you before even just a genome is potentially personally identifiable. Hence the work on homomorphic encryption that would allow comparisons without exposing raw data. This is not advanced enough nor are computers powerful enough for this to be practical on a large scale. We can argue about your misunderstanding of how database indexing works until we’re blue in the face but this is the more difficult issue and it’s the one you haven’t even attempted to address.
Ooh! I just thought of a much better application of a genome database. I said earlier that there are issues with having to wait for diseases of adulthood to manifest in our hypothetical cohort.
Why not instead take an existing multi-generational study and sequence their genomes? Imagine the data you could learn from the Framingham Heart Study, with three generations of CV data! And since everyone there already has a study ID number (totally separate from any other identifying number), we eliminate all the SSN stuff.
I only see three issues with that: 1) a lot of the first generation are dead, so no genomic data, 2) getting people to volunteer their genomes, 3) it’s not all that diverse a place (which has always been a problem with the FHS) so there are a lot of population groups you would miss.
There is only one way to make a database like that secure: put it on a computer that is never connected to the internet. That’s how Los Alamos protects the computer that runs their bomb explosion simulations. But if you make it secure in that fashion, then it becomes a great deal less useful. If you try to make it useful to other researchers by putting it on the net, hackers will find a way to penetrate the security.
Admittedly, at the database sizes we are discussing, FedEx has more bandwidth than the internet. But you still have to pay the army of data technicians who input the queries and transfer the results to the hard drives which get shipped to researchers. And you have to pay FedEx (or UPS, or USPS, or whatever courier service is involved) to ship the disk to its destination. That gets expensive really fast–most likely, proposal budgets would have to include these things, which means less money to do the actual research. And when all is said and done, you have to make sure that the hard drives you are shipping to researchers all over the US (if not the world) don’t have any identifying data that somebody might accidentally put on a networked computer.
There are lots of problems like this that have to be solved. Feel free to argue that we (meaning NIH, NSF, and other funding agencies) should fund the research to address these issues–as noted in various posts, some of them have practical application in other areas, too. Don’t waste our time and money collecting the data before you have solved these problems.
This is a truly breathtaking level of cluelessness. No, you don’t get handed entire genomes for random conditions, because then – in your fantasy system – you can start doing reverse queries and not just re-identify them, but also piece together their complete medical histories in whatever noble land where adults have no control over their PHI.
There’s more, but I’m on deadline. Somebody with more time might want to look at how Denmark handles the information that it keeps; e.g., here one finds the following:
“Access to Danish registry data and data linkage requires authorization by the Danish Data Protection Agency (Datatilsynet) and in some cases, additional authorization from the Danish Health and Medicines Authority, typically when medical charts are to be accessed, and/or authorization from the National Committee on Health Research Ethics (Den Nationale Videnskabsetiske Komité) if biological specimens are to be used or if living persons are to participate in clinical studies. The Danish privacy laws on the use of personal data are stipulated in The Act on the Processing of Personal Data (Act Number 429; May 31, 2000).[47,48]”
And once again, the slow loris demonstrates he has no clue how things work. Let’s go through these one by one.
1. in a research study, you absolutely cannot be photographed or fingerprinted without your consent. I have provided a few photos for research studies; in both cases, the doctors ASKED me, and assured me that my name would not be attached. Fingerprints were not taken, and very few medical studies would need them anyway.
2. Doctors do not accept credit cards, unless they’re named Sears, Burzynski, or Gordon*. There’s this thing called insurance, or, you know, Medicaid. In any other country, there are national health systems. As for health information, it’s securely filed away. Yes, a sufficiently motivated hacker could get into those files, but I don’t see why they’d want to, as it’s generally dullsville and only of interest to other doctors. There’s a huge glaring difference between office files and a big fat database sitting there on the web.
3. Most of the time, you don’t order drugs yourself. You phone or click for a renewal and the pharmacist orders the drug. Unless you’re dealing with some fly-by-night compounding outfit, and like with Gordon or Sears, one could argue that anyone dealing with those outfits is already being robbed anyway, so they shouldn’t be surprised when additional chicanery occurs.
4. No one ever claimed facebook was confidential. Even Facebook doesn’t claim the pages are confidential, and there have been numerous warnings to that effect. If your facebook page has info on your medical condition, that’s because YOU put it there.
Anyone else like to weigh in?
I’m sure whatever involvement you have with data entry has interesting challenges, and requires some problem-solving skills, but you really are doing poorly at communicating. Which may be why I used the term non-sequitur earlier; I don’t see how any of this is related to whether my proposal is a good idea or not.
I can’t “address” some problem if you will not tell me what it is.
Anyone can access some random human dna in any number of settings, and, again, accepting your unlikely premise arguendo, learn something about some individual. In fact, the workers in the hospital where our cohort babies are born could scoop up all kinds of bodily fluids, and run a parallel zebra database. Wasn’t there a Thomas Pynchon novel along those lines, with a parallel post office? Is this the kind of thing that keeps you from sleeping at night?
[email protected]: Well, my doctor takes credit cards (for the co-pay). And yes, most medical files are super boring, but imagine if you wanted blackmail dirt on someone? “Does your wife know about your new case of herpes?” “Anti-retrovirals? Your mother will be so disappointed.” Etc, etc.
There is also a law in the US that prohibits employers from sequencing you because that could be used to discriminate against you in hiring (or firing). “Oh, you have a very high likelihood of cancer? We can’t have people here who won’t be able to give it their all…clean out your desk.”
And it is always possible that as our understanding of the human genome expands, what you can tell about a person from their genome will expand too. I’m not thinking GATTACA bad, but squicky.
This one is too good to pass up:
Hi, my name is Z., and I don’t know how Social Security numbers are issued.
The previous one seems to have vanished into the ether, so I’ll try again with minor variation.
Oh, Christ, not this sh*t again.
Double fail by competent style authority. Clearly, Z. still fails to grasp the different nature of its original error.
Why? That’s the easy part, genius. What continues to perfectly elastically bounce off your skull is everything else.
The Crying of Lot 49
That’s what I was thinking of. There is a vast conspiracy of neonatal nurses tucking away bits of bodily fluid to create a Parallel Zebra Genomic Database.
Or at least, that’s what people should be more worried about than the actual Zebra Genomic Database, which is under strict (Canadian) government supervision [What could be more non-threatening than that, once they get rid of the current buffoons?]
It was called The Tristero Conspiracy, and it was either a parallel postal service or a paranoid fantasy.
Ah, I had previously missed the part where Z.’s fantasy involves the entire birth cohort being born in a single hospital. You can’t make this kind of desperately confused flailing up.
Narad: Ah, I had previously missed the part where Z.’s fantasy involves the entire birth cohort being born in a single hospital. You can’t make this kind of desperately confused flailing up.
Not to mention the outliers who get born on the road or at home. What sort of country just has one maternity hospital? I think even Luxembourg has multiple hospitals.
Justatech: And yes, most medical files are super boring, but imagine if you wanted blackmail dirt on someone? “Does your wife know about your new case of herpes?” “Anti-retrovirals? Your mother will be so disappointed.” Etc, etc.
Yeah, but most offices are fairly well protected, and there are a lot of penalties for people who try that. Wasn’t there a case recently where the law came down really hard on someone who was trying to sneak records from Planned Parenthood?
Yes, there are differing state laws regarding Social Security numbers as unique identifiers. I’m not clear on the federal law for medical records since I think SSNs are still being used as such in many podunk medical practices.
However, anything using SSNs would be a no go for me. A few years ago someone from the VA took a laptop containing veterans’ medical data home* to work on during off-hours. It was stolen and the way the government was able to locate me to let me know this was by using my tax records with SSN. It was almost 20 years since I had been in the military.
Moral of the story. TMI applies here. Too much information is at hand with SSNs already. No need to add more.
*Which I totally disagree with but totally understand if they are anything like business where they expect you to work in your sleep.
**Apologies if I posted this here before; my memory fails me, and often.
Thank you, malia, that’s the kind of answer I was looking for. I’ll be off to visit family later this month, so it looks like a few of us will be spitting into test tubes.
There is another way – you can connect it to an internet, just not the Internet. See https://en.wikipedia.org/wiki/SIPRNet for an example. There are several others.
I made quite a good living helping to build a few private networks back in the day. But your other point is very valid. We had a saying ‘never underestimate the bandwidth of a truck full of mag tape’. There were times that a private armed courier with a pouch was faster and cheaper than 3 or 4 years of circuit charges. Ping times sucked.
Another point about Zebra’s silly plan – most people don’t have genetic anomalies, and beyond the number needed for a really, really good base line, I could see how, maybe 60 to 80% of the database would never need to be used (numbers pulled from my backside, maybe someone can offer up a better range if necessary). All the superfluous data would just sit there sucking up resources, both in the acquisition and the storage. The problem is that you don’t know which part you can leave out. Much better to sequence what you need as you need it.
I’m of two minds about the other half of his plan – his medical records database. Clearly, it violates the HIPAA laws, and mostly I think HIPAA is a good idea. But I also like to go out and spend an afternoon poking holes in a piece of paper (and sometimes tasty animals) with a few friends. But every once in a while, some nut case will decide to shoot up a church/restaurant/town/policeman/school (and yes, you do have to be mentally broken on some level to use other humans for target practice). Everyone agrees that the mentally ill shouldn’t be allowed to buy guns. All we need is to have the government keep a nice list of the ‘afflicted’, and Zebra has given us the rationale (such as it is).
If by “mentally broken” you mean human, sure; I don’t see how it has anything to do with mental illness necessarily, though, especially given that people with psychiatric conditions are no more likely to be violent than “normal” people.
It’s an uncomfortable fact that pretty much any human is capable of atrocities, given the right set of circumstances, etc. The “just plain folks” soldiers in the Wehrmacht were just as awful, in fact, as the really committed SS guys. Then there are “crime of passion” impulsive murders, etc., which are also usually committed by “sane” individuals with a gun at hand during a critical moment.
I mean, if we’re going to profile people who buy guns based on how likely they are to commit a violent act, we shouldn’t be selling guns to men, actually.
Johnny @150. Here’s a better answer
Everyone agrees that the mentally ill shouldn’t be allowed to buy guns.
That’s all very well, but it gives the power to someone like me to decide if you’re mentally ill and to take away what was purportedly a constitutional right.
I’d estimate that, at a minimum, about fifty percent of the population will opt out when they hear the part about all their personal health information being sent to the government.
[email protected] 1
Here’s what I think is the most daunting confidentiality issue. The genome itself can be personally identifiable (i.e. by cross referencing against public genealogy databases). Giving researchers raw sequences is no go, and only giving them certain markers is risky because it’s not entirely certain what can be used to identify someone. This is an issue that already exists with our current data, as evidenced by how researchers were able to expose the identities of participants in the 1000 Genomes Project (see #130).
Right now there is no good solution to this problem. Homomorphic encryption is one potential solution. It allows comparisons to be done on encrypted data but with current technology the overhead is too high for it to be practical for large datasets. I’m sure there are other technologies being worked on but the problem of protecting genetic information is not solved.
I’ll say it again, if you want to talk about confidentiality issues it would behoove you to at least skim the article I linked to in #123. Here it is again for convenience: Routes for breaching and protecting genetic privacy.
Z.’s grandiosity is again noted. Of all the things I recall of Lot 49, though, this isn’t one of them.
Anyway, regarding my immediately preceding comment, maybe Z. didn’t mean “sequencing” at all, but rather “annotating” or something.
Or maybe he had no f*cking idea what he was talking about in the first place.
The Zebra Genome Project has already offshored its implementation. The fashion in which it’s “done so” is pretty embarrassingly funny, but I have to get back to work.
Somebody with more time might want to look at how Denmark handles the information that it keeps
I guess that Icelanders hold the record for the fraction of the population with sequenced genomes, but IIRC they sold all the IP rights to a private company, so the company can do all their research away from the Intertubes and never have to worry about sharing data outside.
Wasn’t it the Mormons? IIRC they are busily collecting genealogical data so they can baptize everyone right back to Adam. Or is that a myth?
“This is an issue that already exists with our current data,”
If you were sincerely trying to answer my question (rather than offering a non sequitur because it’s a zebra suggestion and you want to be oppositional), you would explain why you think my project changes anything in a negative rather than positive way.
“Giving researchers raw sequences is no go”
Then, how are malia and Retro Pump doing their research?
How is what I suggest different from what people are doing now?
I would argue that my approach provides more, not less, security, for the reasons I explained.
So, unless you can actually demonstrate a net negative security effect from my approach, I will consider the security issue closed.
“I mean, if we’re going to profile people who buy guns based on how likely they are to commit a violent act, we shouldn’t be selling guns to men, actually.”
As a man, who has fired some guns in his time, I couldn’t agree more.
“I’d estimate that, at a minimum, about fifty percent of the population will opt out when they hear the part about all their personal health information being sent to the government.”
Especially the people on Medicare and Medicaid.
Right. Like I didn’t think of that.
Medicare and Medicaid are covered by HIPAA. What you’re proposing would require a waiver of that protection.
I do appreciate the concession of every single other point I’ve made however.
Speaking of which:
Do you mean people like malia and Retro Pump? Because they’ve both suggested otherwise. As in:
And (following up on malia’s “IF (and only if)”:
You haven’t addressed the substance of any of that, except by straw-manning. As in:
Blockquote disaster. Here’s that last part again, starting with malia’s quote:
How much clearer could that possibly be? I mean, what use do you expect people to make of data they don’t yet know how to interpret?
^^That second one’s a serious question. Please answer it.
It is the Mormons.
My cousin, who is researching our *interesting* family and her father’s (perhaps) even more intriguing clan, has used their resources for (at last count) 3 countries (US, UK and Ireland), amongst other free on-line material she found. As we already knew, both families have been in the distilled hootch business (gin and Irish whisky respectively). There are other businesses we knew about (haberdashery) but some oddities (a fruit merchant?). There may also be someone (her family) who went to NSW.
So far though, no Norman Conquest.
And about Denmark:
a prof I knew used it for research about depression in families c. 1975-80.
OK. We’ll use France or UK instead of Canada, and get about the same sample size.
Although I think you are incorrect that half of Canada’s citizens would be uncooperative. I’ve already said that USA is a bad choice because it is full of paranoid kooks, but nations with rational universal health care approaches I doubt would have such a rate of rejection.
To be clear, the Danish material didn’t involve the genome then.
AND I don’t think that the Mormon story is a myth.
My atheistic ancestors would not be thrilled.
You haven’t made any points, so I can hardly answer them.
Cherrypicking quotes isn’t “making a point”.
As far as I know, the only commenter who has made a cogent response is Retro Pump.
And as far as I can tell, the only point about which we disagree is how quickly we would convert the data contained in the samples to digital format. I would love to have RP answer my query on that at #112. But neither you nor apparently anyone else has enough expertise to do that, as best I can tell.
Iceland, people, Iceland:
Malia says that they don’t yet know how to appropriately interpret the data. That’s not cherry-picking. It’s her main point. She’s stated it concisely twice. And she’s also explained it at greater length twice.
You’ve ignored it. You’ve also misrepresented it by characterizing what she said as being about eye-strain, noisy data, and slow processing, none of which she mentioned. And when that’s been brought to your attention, you’ve mischaracterized the objection as being about chickens with their heads cut off.
I asked a question that you absolutely can answer. It is this:
What use do you expect people to make of data that they don’t yet know how to interpret?
This is also not cherry-picked. And it is a valid point:
You would not be saving costs on presently ongoing research by spending money on something that researchers presently don’t need and can’t use.
That’s a significant obstacle. To most people, it would be a dealbreaker, in fact.
How do you propose to overcome it?
Ann, I hesitate to respond quickly in case you are going to have one of your sequential commenting fits and stuff will be crossing on the wires, but anyway.
Look at the reference to the Icelandic study. Look at the number of “working geneticists” who apparently thought it was a good and useful idea. Go figure– is it some peculiar Icelandic genetic derangement that caused such agreement, or is it just scientists who want to move their discipline forward?
Now, I’ve dealt with the funding issue more than once, so you will have to go back and read more carefully. The data will be provided free or at a nominal fee to qualified researchers, and it will not take funding away from existing funding agencies. (And I still haven’t seen any evidence that operating budgets would be unmanageable at all.)
But as to malia and RP– the data isn’t for them specifically, it is for an expansion of research, so that people can figure out “what to do with the data” sooner. That’s how basic science works, by lots of little grad students working on lots of little projects and figuring out better ways to do things and contributing to the whole. Malia and RP can go on with their 20th-Century approach, and we will see which bears more fruit.
Another 28% said that it would probably be the case.
Additionally, seven in ten Canadians think that protecting the personal information of Canadians will be one of the most important issues facing the country in the next ten years.
That was as of 2013.
Per an even more recent survey, a majority of Canadians are also just distrustful of government generally.
What, the Iceland where their project to collect genetic data was ended thanks to precisely the kind of privacy issues you have dismissed out of hand?
Since the information would not be available to the participants, so they couldn’t be asked for it, what possible relevance does this have?
Once again, like capnkrunch, you are providing non sequitur with respect to the specific plan I am proposing.
So the solution is to expand our data gathering efforts when there are already confidentiality issues with our current data? How exactly does that make anything better?
Yup, I was wrong here. It’s this data that can’t be publicly disclosed, not that it can’t be disclosed to researchers. Ideally, even researchers should not be accessing personally identifiable information. It’s a tenuous system already. Greatly expanding the amount of data will only worsen things.
For one, scale. On small scales some security measures work fine. For example there have been successful implementations of homomorphic encryption on datasets of 20 or so. Resource consumption scales with size. We’ve been telling you this the entire thread and somehow you still can’t grasp the concept.
More importantly, it doesn’t matter whether it is different. The current scheme has issues. You even agreed at the beginning of the post.
Look. I’ve given you plenty of references and encouraged you to read them multiple times. At this point it’s clear your ignorance is entirely by choice.
Very interesting.
Further @#168 —
It’s already failed.
They actually proposed it. But they have just as much opposition to governmental invasion of privacy there as we do here, and it’s much better organized. Plus they have a very bad track record when it comes to keeping patient information private.
So the idea met with a lot of opposition due to widespread privacy concerns and expert opinion to this general effect:
So it never happened.
They do have a databank of half a million people, though.
Do you know what a non sequitur is? There are legitimate confidentiality concerns with current datasets. There are proof of concept attacks to re-identify genomic information. Starting a mass data collection campaign without solving these issues first is irresponsible in the extreme. And it makes even less sense to put the PHI of so many people at risk when you have not one but two experts in the field telling you they don’t even need the data.
My point was that almost half of the Canadian population has indicated that it would refuse to participate in the collection of genetic information due to privacy concerns, in which case they wouldn’t be participants.
This is relevant to your argument that “paranoia” in the US wouldn’t be an obstacle in countries where the government provides universal health coverage.
That’s untrue in both the UK and Canada. There are very high levels of distrust and protest regarding the databanking of personal health information generally and genetic information in particular in both.
The direct transmission of genetic information by health-care providers to other persons or institutions is forbidden, apparently without qualification.
So strike three. It wouldn’t work there, either.
Any other ideas?
There’s a 186-page .pdf summarizing pertinent national regulations worldwide here, if it helps.
Yeah, I think you are dodging the question.
Is it a bad idea to expand genetic research?
Is it a bad idea to expand genetic research using Zebra’s Database?
And you can’t say “both”.
The fact that you can’t directly respond to that question is why everything you’ve written is a non sequitur. It has no relevance to whether my way of doing it is better or worse for security.
If a researcher has N genomes, the security issue does not depend on whether she obtained them from the central database or by taking samples and sending them out for sequencing or doing the sequencing in her lab.
The fact that you refuse to acknowledge that demonstrates that you are not trying to make a sincere argument.
Also, wrt to this:
I’m not sure I even understand what you’re saying.
If they wouldn’t be asked to provide their personal health and/or genetic information, how could it be legally obtained by researchers?
If you’re suggesting that the government just hand it over, it’s precisely the fear of that kind of thing that makes almost half the Canadian population unwilling to get genetic tests.
So it’s still a relevant point.
And to avoid any purposeful misunderstanding, that’s “N genomes in digital format on her computer.”
I’ve already indicated that I think my method is more secure than either of the others in total.
Please explain how you plan to overcome the widespread popular objection to genetic databanking in the UK and Canada.
Please also explain how you plan to overcome the legal obstacles to creating such a databank in France.
Or, alternatively, please suggest a nation where your project would be feasible.
Whether or not data is secure actually does depend on where it comes from, who collects it and where it goes.
Both you asshat. Until there are better privacy protections in place I think any mass collection of genomic data is irresponsible. Do I think we should stop using what data we have already? No. Do I think it would be reckless to collect additional sequences on a large scale? Hell yes.
Scale, scale, SCALE! you twat. There’s a difference between exposing a couple thousand people’s PHI (about the size of the biggest datasets currently) and exposing hundreds of thousands. Most of the current datasets were collected before the problem was well known. Those researchers had an excuse: no one knew the privacy implications of collecting genomics data. You have no such excuse.
There’s also a difference between privately held data (i.e. a researcher sequences genomes in her own lab) and shared data (i.e. a database that needs to be accessible by many different organizations). The attack surface is much greater in the latter scenario.
Using our current datasets, people can already be reidentified using their genome. Why the hell would we want to expose hundreds of thousands more people to that? The fact that you choose not to answer that question means you are just being deliberately obtuse.
Ah yes. I remember you saying that but you never explained why. I can’t imagine why you think that given that the attack surface is much greater. Please do enlighten me, great security guru zebra.
This thread is getting very tiresome. Barring a reply with some substance from zebra I’m bowing out.
No. It’s that you’re wrong to say that.
The national Icelandic database was a government-backed commercial endeavor, and it was not supported by scientists or doctors:
Got that? The national database was not the one that was going to be used for diagnostic/therapeutic research.
And it was opposed by the people who do such research:
That’s why the project failed, just as it did in the UK.
As to this:
You have yet to produce an example of so much as one, single 21st-century thinker who wants such a database besides you.
All the attempts to start one have been corporate-government partnerships the (ostensibly) scientific justifications for which were decried by scientists in terms such as “complete balls.”
I’ve read what you said. That the data would be provided free does not save costs unless people need and want but can’t afford it. And there’s not a whisper of a hint that that’s the case. On the contrary, all the evidence suggests — or even states — that such a database would be useless for research purposes. For example, as the article linked above says:
Got that? The kind of research you have in mind is presently impossible. So you would not be saving costs on it. It can’t be done.
That leaves you with the expense of compiling a database to make useless information freely available.
And if the money would not come from existing funding agencies, where would it come from? Setting up and staffing a new agency would just add to the costs.
Given that zebra can’t name a country where his idea is feasible or produce any evidence at all indicating that there’s any need, wish, or use for his project I think that unless he does, its uselessness has been too fully demonstrated to require further proof.
So I too am bowing out until then.
Take it away, z. The field is yours.
If you oppose expanding all genetic research because of security concerns, that’s fine, but you could have said that way back. I happen to disagree with your risk assessment, and that of the numbers of people ann cites as having concerns. It is often the case that people misperceive relative risks; I think that has been discussed here in the past.
I think I’ve expressed my take on this but let’s see if I can briefly revisit it. There are “hundreds of thousands” (millions, actually) of medical records extant in many locations– not genomes, but actual medical records with easily understandable information about individuals and their health issues. So the “attack surface”, if I understand your meaning, of my database represents a trivial relative risk.
If there are any Willie Suttons out there looking to exploit and monetize medical information, they would be attacking Medicare and Medicaid, the VA, and the private insurance companies.
Why would they go after this not-very-liquid asset, which requires much more work before they can identify individuals, and which is tiny in “scale” compared to what they can get from any of the other large institutions, with many more avenues of access?
I’ve pointed this out in various forms more than once. I’ve also pointed out that the actual mechanics of my project keep things as arms-length as is possible, which ann is clearly befuddled by, and she needs to review how I’ve said it would work.
Relative risk– people just aren’t very good at thinking these things through.
I already mentioned the UK which has a large government-owned database which uses phenotypical grouping.
The Iceland study I talked about is here:
Since your link doesn’t work, I am not sure what you are referring to.
That study only did whole-genome sequencing of 2,636 Icelanders, not the whole population. If it demonstrates anything, it therefore demonstrates that the kind of database you’re proposing isn’t necessary for genetically homogenous populations.
However, as it happens, your idea was attempted in Iceland. And it fell apart for the same reasons it did in the UK. Scientists and physicians opposed it, and there were privacy concerns.
The busted link is here:
^^As you might gather from that, the public was not initially opposed. This was in 1999. However (as explicated at krebiozen’s link) the Supreme Court put the kibosh on it in 2003. And concerns about privacy have greatly increased since then.
krebiozen’s link, in the event that you need it, is here:
SHORTER VERSION: Iceland didn’t need a national genome database to do that study. And when one was proposed, it failed due to the objections being raised here and everywhere else in the world that a corporate-government partnership tried to rustle one up.
In all cases I’ve seen so far, scientists and researchers didn’t want or favor the project. That includes Iceland. This:
is therefore flatly wrong. That study sequenced fewer than 3000 people. It did not require a national database of genomes. And the working geneticists who did it apparently were perfectly well able to move their discipline forward without one.
I’m not the one who suggested that a study in which working geneticists were able to reach conclusions about the population of Iceland by whole-genome sequencing a mere 2636 people constituted proof that scientists wanted and needed a population-wide genome database in order to move their discipline forward.
First, the UK project here
is alive and well, or at least its website is.
There was a commercial Iceland project that was sold off multiple times after not making money, but that is nothing like what I am suggesting, and I don’t know what connection there might have been with the Nature paper I referenced. That was the first thing I came across.
But as I said earlier, you don’t seem to grasp the concepts very well.
I am not suggesting that the entire population of Canada should be included in the database.
Nor have I ever suggested that any individual study would use the entire database.
I already went over this multiple times and refer to these ideas as grandiose strawmen, which accusing me of suggesting is ridiculous even by the standards of this blog’s comment threads.
You obviously haven’t comprehended what are essential elements, like the fact that the genetic information is completely isolated, including from the people contributing it.
You seem to think that if half the population refuses to participate, we can’t collect the data from the other half. It’s like gay marriage, ann, if you don’t believe in it, the law doesn’t require you to participate, but you can’t stop other people from doing it. And anyway, you and capnk keep complaining that there is too much data– ok, you should be happier with half as much, right? That’s something around 190K genomes.
So if you just want to keep repeating this stuff which is either wrong or irrelevant, I don’t know how to further help you.
Yes, I know. The one that failed was the one that was like what you’re proposing — government sponsored, all citizens, cross-referenced through the NHS, etc.
Read the link.
It’s the same project. The company is called DeCODE. It made a tremendous amount of money hyping itself, then crashed, leaving individual shareholders with trash.
You may now troll yourself.
I’m bowing out.
IDK, zebra, maybe you need to ask those here to look at your genome db proposal while on…weed.
You’re beginning to sound like a communist. “Yeah, communism historically has caused untold misery but you just haven’t tried my communism yet.”
And then you take it personally that they are against it because you mentioned it. You know, I originally thought the scientists here would take a database of this information if it auto-magically dropped in their laps. Not that this was ever in anyone’s power to do anyway, but they made a good case why they wouldn’t even want that. Data is not information. Throw in all the other impractical reasons and this idea is dead in the water.
But don’t worry, it always was because there is no will, money or defining interest in it yet. The efforts are better spent elsewhere. Like, maybe, feeding people whose genes may or may not hold ticking time bombs. But as I have learned in resisting the never ending worry about my hypertension and cholesterol by MDGPs and naturopaths, it doesn’t matter to me what is going to kill me in ten years if I can’t survive in the present time.
OK. There’s no “there” there. I’m following ann. Some parting comments:
-I’d challenge you to reexamine this belief that “I don’t agree” constitutes a valid argument.
-Comparing relative risk of storing personal medical data and of centralized collection of research data is meaningless. I’ll leave it as an exercise for you to figure out why (hint: you’re using the wrong risk assessment tool).
-The prevailing (and most prudent) paradigm in information security is to minimize all unnecessary exposure, not that additional exposure is ok so long as it is trivial compared to another.
Which, of course, is why Z. proposes a monolithic government project. Add in completely whimsical, magically free data access, and Problem Solved!
Oh dear, oh dear, Z. has demoted poor Malia from “real expert” to “20th century thinker.” How will her NGS lab go on?
(For those keeping score, I’d rank the juxtaposition of these two quotes as being a genuine non sequitur.)
It’s truly amazing the amount of hubris that one person who has no practical or theoretical understanding of any of the scientific disciplines involved* – indeed, who can’t even recognize a basic arithmetic error – much less the issues surrounding privacy and ethics in human subjects research can single-handedly generate.
* “capnkrunch 136, I’m sure whatever involvement you have with data entry has interesting challenges, and requires some problem-solving skills….” Sweet Jesus.
So the issues with the “birth cohort genome database” are:
1) We can’t get the medical records we would need to know if any given gene actually has a relationship with a disease.
2) We don’t have the processing bandwidth (technical or human) to actually analyze the data.
3) The cost of storing 1 cohort’s data is about $1 million. Yes, I know Z said we could get it by not making F35s, but since when has DoD ever given up a single penny of their budget to someone else? See bake sales for bombs.
4) It would take generations to be able to learn anything about adult-onset diseases.
5) Acquiring the genetic material to sequence requires both an approved study and the informed consent of every single person in the cohort (or their parents, and given how protective people are of their children I think that a 20% enrollment rate would be outstanding).
6) We already know that we cannot keep the data private, as is required *by law* for any human-subject study in the US (and most other countries).
So aside from all that, it’s a brilliant plan.
There was a commercial Iceland project that was sold off multiple times after not making money […] and I don’t know what connection there might have been with the Nature paper I referenced
Perhaps you should have read the paper.
That list has some omissions IMHO, but I’m certainly not going to fault anyone with losing interest in this Z. misadventure.
Not a Troll @198: I can’t speak for anyone else in the commentariat, but I suspect it would take LD50 levels of tetrahydrocannabinol to get me high enough to think that our equine friend’s database might be a good idea. Several commenters have raised a number of objections, any one of which should have been sufficient to sink the project, yet he remains as impervious to rebuttal as an old-fashioned vinyl record. (Or CD, in case you are too young to remember vinyl records.)
He reminds me not so much of Communists, as of political figures like Reagan and G. W. Bush. They would get certain ideas in their heads, and no amount of evidence to the contrary could get them to abandon those ideas. He seems to disdain those of us in the “reality-based community”, to use a phrase coined by an official in Bush’s administration.
Indeed, and not just for the punchline. Reference 18 looks totes kewl, as well, even if it is seven years old.
But this episode highlights, for me at least, Z.’s serious, ah, “oppositional” take on “20th century” notions regarding the order of carts and horses.
This one seems to mark the beginning of Z.’s realizing that Malia was getting uppity, or something:
Then again, the whole single-cohort routine does at least do away with the hoary 17th century dualist notion of comparing the size of the cart and the horse and thinking about maybe putting one on top of the other instead.
For those of you who think that z*bra is unable to learn may I point out a fact which TOTALLY destroys that hypothesis: z*bra changed from writing “non-sequitur” to “non sequitur” midway in this thread! Nothing else has penetrated its addled pate but s/he is demonstrably better informed than when this conversation began.
And yet, he’s dazzled unto blindness by the 21st-century-ness of an idea that people have been using to fleece suckers, marks and governments since 1999 (e.g., DeCODE) precisely because it elicits that response.
You’re leaping to the assumption that he’s figured out why.
A note on the quality of the responses, ann at 196
1) This is government sponsored and government owned and obviously they have access to the NHS data.
2) If anyone is smoking or imbibing something, it seems to be ann, who still hasn’t figured out– despite the fact that I just said it– I never suggested anything like “all citizens”. Grandiose strawman.
And various others presenting various strawmen and gish gallops too numerous to mention.
The good news is that RP, who is actually qualified, “got it”, and that makes it worth the trouble.
You wrote something so idiotic it annoyed a long-term lurker enough to delurk and write:
Yet you somehow interpret this to mean RP agrees with you? Astonishing.
Question for non-zebra posters, what would the ethical implications of such a database be? I’d imagine that collecting medical information past the age of consent would require a new consent from the patient themselves but at that point could they request that information gathered with their parent’s consent be deleted? It seems a little questionable to me to allow a parent’s consent to be enough to collect the information and store it for an arbitrary amount of time. Maybe if it was collected with parent consent and couldn’t be used until the patient was at the age of consent and provided it.
Also, forgot to say it before, thanks malia and Retro Pump for chiming in. If nothing else, zebra at least stimulated some new people to delurk. I agree with Narad’s sentiment in #110.
“Yet you somehow interpret this to mean RP agrees with you.”
And yet another bizarre strawman, based on yet another cherrypicked quote.
Thanks for demonstrating my point.
I could have done without the high horse insults but I actually learned a great deal from this discussion. And for those who had a fair grasp of the subject before, I hope it strengthened your thinking on it.
Yes, I know.
But you’re proposing a government-funded DNA database of the entire national birth cohort for one year, with a view to creating a resource that’s large and nationally representative enough to provide researchers with the subpopulation sample of their choice.
The UK proposed something equivalent. And as I said @#179, it failed.
Read the link I provided. Here it is again.
The project you’re linking to now is also a government-sponsored large DNA database. But its exclusive research focus is cancer and rare disease. It recruits exclusively from a pool of patients who have already been diagnosed, plus their relatives.
Fergus mentioned it back @#115. And you ignored it. Because it’s not remotely like what you’re suggesting.
The entire point of your project is to assemble a database that represents something like all citizens — an entire birth cohort.
You directly admitted that it was akin to a population-wide*** project yourself. (“Iceland, people. Iceland.”)
***Or actually what you wrongly believed was one. But same difference.
And zebra @#209:
“Equivalent” is just your (yet again) ridiculous strawman. But putting that aside:
“The UK proposed something equivalent. And as I said @#179, it failed.”
In 1992 or so, the Clinton administration tried to institute a universal healthcare system. It failed.
So I guess that for you, that would be evidence that universal healthcare is a terrible idea. And that no progress could possibly be made towards that goal, under different conditions and incorporating lessons learned from the first attempt.
As I said, strawmen, non sequitur, gish gallop, yadda yadda.
After they somehow are able to a priori define meaningful subpopulations based on combinations of subsequence queries.* Unless, of course, the “results” are going to require genetic screening of everybody to have clinical utility.
* Again, nobody’s going to be handing out entire genomes willy-nilly based on fishing-expedition–level proposals.
There look to be a lot more questions than answers. Pages 323–24 here (PDF) are on point. (The Elger & Kaplan is here; “supra note 133” in n.143 is a typo for 135.)
One key issue is that the sample donation, by definition, does not provide any benefit to the donor, although it does entail ongoing risk. For that matter, if things go swimmingly, the end result will likely be a private company reaping a handsome financial reward from all the donated samples.
^^ In fact, now that I think about it, Z. seems to have been awfully short on that “clinical utility” thingamabob in general.
Since it’s your bright idea and you’re invoking them, what would those “lessons” be?
In the UK the Nuffield Council on Bioethics has done a lot of work in this area. It is difficult for many reasons, for example because a person’s genome also gives information on their relatives’ health, thus impinging on their privacy too. There are also issues around informed consent, as we don’t really know yet what information might emerge from the genome in the future. Some embarrassing or even criminal characteristics might be connected to some constellation of genes – imagine if someone found an association with pedophilia, for example (as unlikely as that may be).
Better politics. Duh.
Death panels, keep the government’s hands off my medicare (ann might have been one of those), your giant corporate employer will drop your coverage, blah blah blah…
Here it’s some kind of ninja codebreakers….
1. Accessing the digitally encoded genome of a small random part of the population, either by getting into a secure database (or I think someone was worried about Fedex? trucks carrying hard drives being hijacked?)
2. Figuring out the medical information contained in the genome, which our experts say is excruciatingly difficult.
3. Determining who the individuals are from the genome.
4. Somehow turning all that into a profitable venture….
5. Instead of paying off some minimum-wage data entry person to get them access to medical records at the insurance company.
So yes, just like one set of irrational concerns was overcome, good politicians convincing enough people not to be stupid is what you need. Just like with the vaccination exception problem, as an even more recent example.
I think we are up to ~221 posts trying to convince you not to be stupid. It’s no easy task when we are confronted with a D-K sufferer who is showing all signs of being in the top 1/10 of 1% of all those so affected.
As well you might, since it is equivalent, by your own admission.
Why, no. It would not. Because there’s abundant evidence that universal healthcare is a good idea, which can and does succeed time and time again.
But speaking of gish-galloping, cherry-picking, non sequiturs, and straw men:
As I’m sure you know, when I first raised the point it was in the context of pointing out that there were, are and have been such numerous and widespread objections to your tired old “21st-century” idea everywhere in the world that anyone’s ever tried introducing it for the last sixteen years that that’s how long it’s been failing!
And as I said: Ha.
Narad and Krebiozen:
Thanks. That’s exactly what I was looking for. I made it through Narad’s links and have been skimming Krebiozen’s. It definitely seems like an interesting area in bioethics. Interesting (to me at least) is that there doesn’t seem to be any issue with parents consenting to their children’s data being stored, as long as the child has the ability to opt out once they reach the age of consent.
I thought that would be more of a problem, especially in light of what Narad said:
I always thought the principle behind allowing parents to consent for children was that they could make decisions that are best for the child but the child wouldn’t make for themselves (i.e. no child would choose to get injections). In this situation where the benefit to the child is more unclear (there’s benefits for society but not necessarily the individual) I had figured the same principle might not hold. I’m not an ethicist though and it seems like the experts think otherwise.
You would have done yourself a favor to pretend that you had never seen the question.
I’ve already openly wondered here what on earth your college or university training was in; in point of fact, I often have a hard time convincing myself that you’re not a none-too-bright high-school student.
I confess that when I pointed to the Iceland red herring back in comment #158, it seemed too much to hope for that anyone would take the bait.
That’s my comment at #222 – I do hope you aren’t referring to me.
You did kindly point out that bear trap, but zebra just blundered on in anyway.
I do find it hilarious that zebra is now simply dismissing the enormous problem of privacy that hundreds of the best minds in genomics and bioethics are currently wrestling with.
Krebiozen @ 229
My comment was to z*bra’s at 223.
Memo to self: check for typos before noting someone else’s stupidity.
I was really surprised that an experienced drive-by troll like yourself would pitch me such a softball.
Although if it’s better politics he’s looking for, I’d suggest business-friendly Qatar. It’s an absolute monarchy without much in the way of civil liberties and a partnership with Weill-Cornell Medical College. The politics couldn’t be any better, really.
There’s even a regional precedent of sorts:
Kuwait, people. Kuwait.
Why would any mature, rational (non-paranoid), non-Authoritarian adult do anything but dismiss this kind of silliness about “ethics” and “privacy”?
I also don’t pay much attention to the ruminations of very bright, very well educated Roman Catholic theologians, who are equally discussing meaningless concepts. Is that also “hilarious”?
To the extent that society is actually democratic, decisions can only be made on the basis of group self-interest. Crudely put, something like greatest good for the greatest number.
(I have no moral/ethical position like Utilitarianism, that’s just a familiar form of expressing a pragmatic approach.)
Sorry. That link should have been:
If you’d like to ask some who don’t, the names of the ones who serve on the advisory panel of national leaders in medicine, science, ethics, religion, law, and engineering over at the Presidential Commission for Bioethical Issues are right here.
Ah. The answer to a non-Authoritarian thinking something is unimportant is Appeal to Authority.
Well OK, I guess that’s settled.
Ann, you need to take a breath and think. Once in a while you demonstrate that you can write a complete paragraph that makes sense, but this kind of thing inclines one to not bother reading your comments.
Kuwait: I’m not going to waste my time researching the intricacies of the Kuwaiti political system, but I think it isn’t very democratic. Also, the actual citizens (not foreign labor) tend not to care that much because they are petro-wealthy. But maybe I will be corrected.
You should read about the very long battle to institute Universal Health Care in the US and the forces that opposed it. You should also, again, pay attention and think– 1992 was only 23 years ago, and everyone said UHC was never going to happen. And then there was don’t ask don’t tell, which was considered the best possible compromise at about the same time, and look how far we’ve come with respect to marriage equality. Not one person would have predicted where we are today.
So, I wouldn’t be so impressed that some attempts to do genetic research and collect data have been blocked; it’s early days.
It is also the case that GMO have been blocked in Europe. Does that mean GMO are a Bad Idea, and they will never be accepted?
For someone who complains about non-sequiturs….Zebra throws out quite a few of them, doesn’t he?
I suspect explaining anything that a “mature, rational (non-paranoid), non-Authoritarian adult” might do would go right over your head. Let’s try this: how would you like it if your genome was publicized and it was revealed that you had a constellation of genes associated with tiny-penis-syndrome or intractable-pig-headed-disease?
Concerns about privacy are not about being “Authoritarian”, they are about people being rightly worried that their privacy might be compromised by government and big business. That’s the opposite of “Authoritarian”.
What you seem to have a problem with is the idea that some experts know better than you do. Perhaps you have never studied any subject in enough depth to realize just how little you know and the need to have people specialize so they can advise those of us who have other areas to specialize in. I spent two years studying genetics, passed exams on the subject and have read dozens of books on the subject over the years, enough to know just how little I know. I would never have the hubris to accuse two geneticists of 20th century thinking, I am quite certain they know better about that area than I do. You appear to have no concept of the vastness of your ignorance.
You seriously think that privacy concerns about data that can be used to predict a person’s medical future and have real tangible consequences are “meaningless concepts” of a similar ontological status as souls and sins? If you do you are even dimmer than I thought. Once again you equate scientists with other groups, as if science is just like religion, or woo. Have you drunk from the PoMo cup?
What are you babbling about now? Sequencing the genome of millions of people isn’t going to give us useful data, for several reasons that have been explained. That isn’t group self-interest or the greatest good for anyone. Any normal person would have admitted they were wrong and joined a discussion about the best way to go forward instead of embarking on a moronic pantomime as is your usual MO. Not for the first time I find myself seriously wondering what is the matter with you?
You have been told, by two people who work in the field, why your suggestion is utterly impractical, yet your ego won’t let you admit you were wrong.
Not only that, but you are rude to people who have been perfectly polite to you, and accused those whose explanations have been easily understandable and make good sense of being bad at communicating, or of being on drugs. In short you are a twit of the highest order.
No. The answer to blind solipsistic bias is exposure to the thoroughly considered views of others, conveniently available online in .pdf form.
Also, I suggested that you ask them, not that you take their mere authoritative existence as proof of anything.
No, once again you’re using a straw man to avoid addressing the substance of the objections.
You wound me to my very core.
Was the fact that it has compulsory DNA testing your first clue?
Technically, it’s semi-democratic, which — if you ask me — means it’s an autocracy that can afford to allow a superficial and largely cosmetic veneer of freedoms in some areas.
The problem with talking about what the actual citizens care about is that it presumes that the concept of citizenship plays a part in the way Kuwaitis think about themselves, others, politics and/or where they live. It’s not that kind of place.
However, since I’m not sure what you’re saying they don’t care about, exactly, I guess it doesn’t matter anyway.
I’m a politically active person who has not only lived through but actually fought in some parts of that battle, which is — incidentally — not actually over, because guess what?
We don’t actually have universal health care in this country.
Yeah, well. I’m an optimist myself.
But I can see how 23 years of dealing with implacably ill-informed, self-regarding, and overly complacent bozos who evidently don’t even know what UHC is could make a person kind of dispirited.
Not by me.
I think you mean “not one person who knows so little about the fight for marriage equality that he thinks ‘don’t ask/don’t tell’ had something to do with it would have predicted where we are today.”
Because although the outcome of Obergefell v. Hodges wasn’t certain, it was among the reasonably foreseeable options.
Speak for yourself, IOW.
Enjoyable as it’s been batting them around, those analogies are straw men. There’s no constituency fighting for the right to establish a DNA database of the entire birth cohort for one year. There’s no popular political demand for one. There’s no scientific demand for one. And not only that:
Widespread, well-grounded and rational political and scientific objections are actually what’s prevented private-sector corporate interests seeking profits from doing an end-run around democracy by partnering up with advocates for a surveillance state in order to set one up.
So get back to me when you can make a case for it being a cause worth fighting for that’s a little more persuasive than “I can see the future because I say so.”
Or make it now, if you have one.
No. But my point has never been that your proposal is a bad idea because it’s been blocked. It’s that it’s been blocked because it’s a bad idea.
So that’s just straw plus straw.
Your somehow managing to hit it directly into your foot was certainly impressive.
On the other hand, the fact that your grab-bag of language failures includes the word ‘troll’ is merely dirt-common.
^ Actually, it was closer to deciding on a sacrifice bunt with no runners on.
I tried to make this point back in comment 139, but Z. must have been too busy with some other facet of his brilliance to notice.
The issue is with “privacy” and “ethics” as concepts– they are just as vacuous as souls and sins. How is “being unethical” different from “sinning”? In either case, some authority has created a set of rules which don’t have legal consequences. If they do have legal consequences, we call them…you know…laws.
If you are asking whether I would vote for laws that restrict access to my medical information, that’s a different question. Of course I would, as a practical matter. But other than through the interpretations of SCOTUS in the USA, I have no “right to privacy” that I can characterize– I would have to go to court with standing in a specific case to find out what that would entail. Just like I suppose I would have to die to find out if I have an immortal soul.
Does that help with understanding my ontological perspective?
And of course, as should be clear from multiple comments on the topic, my vote concerning genomes would be influenced by what I consider the relative risks of any particular program. Since my clearly understandable medical information, not to mention my penis size, is on record with multiple entities, I consider the relative risk from those ninja hacker codebreaker criminal geneticists I described in 223 to be negligible. If you have concrete evidence (not speculation) to the contrary, let’s hear it.
By the way, if you don’t want to be characterized as befuddled (for whatever reason), or just being childishly dishonest yet again by taking quotes out of context and MSU:
See, I never said any of those things. I equated ethics with morality. Nothing about scientists. Nothing about the practical matter of access to medical records being restricted. Grow up.
This is what I mean. What does one have to do with the other?
The howlers just keep coming.
If you’re not a member of congress, he probably wasn’t asking you whether you would vote for any laws.
How — how, ffs, how? — is dying like going to court to find out if you’re safe from illegal search and seizure in your own home?
I mean, that you’d have to [do something] in order to find [something] out is really not a unique enough distinction to be categorical.
No. I’m not sure I even see any ontology.
Unless you’re only weighing the risks to yourself, that would be an ethical consideration, by definition. BTW.
Well. That voting-for-laws thing makes more sense if you’re Anthony Weiner.
I suppose you might also be Iggy Pop. But it seems unlikely. He’s very articulate.
And there are, no doubt, other possibilities, too. Still. If you don’t mind my asking:
Exactly what entities keep records of your penis size? And in what form? (Meaning “in what form are the records?” not “in what form is the size of your penis recorded?”)
There’s no such thing as concrete evidence that a straw man you invented yourself @#223 does or does not pose a negligible risk. And can’t be.
A law requiring compulsory universal nationwide DNA testing with no exceptions, no informed consent, and no regulatory limits is so completely incompatible with the basic democratic principle of respect for human rights that any country that has one is, necessarily and self-evidently, not very democratic.
Human rights? Democracy? Vacuous concepts.
I couldn’t really tell whether his point was that all concepts are vacuous per se or whether it’s just the ones that can’t be instantiated he rejects or…I don’t know. I was baffled by that.
Because this part here…
…seems to suggest that because laws serve a practical purpose they are, though conceptual, not vacuous.
However, the same not only could be said of ethics, it would almost have to be if it was also said of law. They’re not entirely conceptually discrete.
I’m not going to get into whether that also applies to religious morality, because murkiness. But a case could be made.
The really confusing thing, though, is that starting here…
…it all of a sudden turns out that it’s the concepts that have legal consequences — or, you know, “laws,” as we call them — that are as vacuous as the wages of sin.
So I guess I would say that zebra just scorns the concept of concepts, generally. But I know that can’t be true. Because:
So apparently the thing that distinguishes a vacuous concept from the good kind is whether zebra is capable of conceiving of it unaided. Which is conceptually problematic, from an ontological perspective.
The whole thing left me befuddled, I don’t mind frankly confessing.
The issue is with “privacy” and “ethics” as concepts– they are just as vacuous as souls and sins. How is “being unethical” different from “sinning”?
Speeches like this really need to be delivered in a laboratory, in the bowels of a castle or an extinct volcano. With lightning crashing overhead.
YOUR PETTY HUMAN ‘LAWS’ and ‘MORALITY’! THEY CANNOT BLOCK THE ADVANCE OF SCIENCE!!
There’s something oddly appealing about zebra’s child-like naïveté around human nature (and human dignity). Yet since he is an adult, a case could be made that it is more like childish ignorance.
I’ll leave you with this and check back tomorrow to see how you do.
Prior to the US Civil War, it was “illegal” to aid an escaped slave–punishable by imprisonment and a hefty fine.
Was it “unethical” or “immoral” to aid an escaped slave?
Prior to the US Civil War, it was “illegal” to aid an escaped slave–punishable by imprisonment and a hefty fine.
Was it “unethical” or “immoral” to aid an escaped slave?
What the everloving f*ck is this even supposed to mean in this context? Z. is trying to be all didactic about the difference between “legal” and “ethical” after just having declared the entire concept of ethical and unethical actions as being equivalent to “sin” in the religious sense, and therefore empty and meaningless? The mind boggles at such inanity.
What I'm wondering, actually, is just who Z. imagines his achingly tiresome sh!tfit of a performance is for. Is he trying to “educate” the lurkers, of whom a couple* have delurked specifically to explain just how wrong and ignorant Z. is? Is he trying to “educate” the regulars? Who does he imagine is laughing with him at home?
It all betrays a “See Noevo show” level of idiotic grandiosity.
*who happened to actually know what the f*ck they were talking about, to boot.
^Multiple tag fails are parse-able, I hope.
Another non-sequitur from Z, why am I not surprised?
I get your point that ethics, sin and law are malleable. Although I happen to be one who thinks that certain truths are self-evident, you can go on with your idea because you don’t get to say what my inalienable rights are.
Interestingly enough (or not) a couple of months ago I provided consent and some of my DNA for a study. However, I knew going in what it was for, that it is with one company, and under all the rules and regs regarding study participant’s information. But even I wouldn’t go in on your ‘master’ database. And not because it was your idea. It is because of all the other reasons commenters have raised here.
And, I’ll leave you with this riddle. If you convince the politicians to pass a law for my genetic material for your DB and I refuse to provide it, is that immoral or unethical or just illegal?
Unless you can tell me why we’re suddenly having the kind of deep conversation about ethics, morals and the law that most people leave behind when their YA-fiction-reading years are over, I’m not sure I see how it’s relevant.
I said “not entirely conceptually discrete” not “invariably one and the same, with no possibility of conflict either in theory or in fact ever, ever.”
In case that’s where this is coming from.
I do suddenly wonder whether someone has earned an advanced degree by elaborating the modern history of the use of scare quotes, given that they would have been of obvious value here to distract from the painful mimicry.
“They’re nihilists, Donny, nothing to be afraid of.”
Come on guys, don’t give him too hard of a time. We all know ethicists are just secular theologists and the Universal Declaration of Human Rights is their version of the Bible. When you think about it, isn’t the concept of human rights just as vacuous as transubstantiation?
That’s a good idea.
I sometimes idly wonder why reflexively throwing “teleological” in front of nouns like “perspective” and “argument” has never gotten vogue-ish. I mean, I personally prefer to use the word “f*cking” when I feel the need to preface my nouns with a couple-few empty adjectival syllables. But there’s certainly nothing wrong with “teleological.”
I’m a fool for idle thought, though.
^^That was @Narad, #261.
Better politics. Duh.
I’ve been known to use the T-word.
Not-very-democratic though it might be, you know what Kuwait’s got that we don’t?
Universal health care. That’s what.
Sometimes it’s T-word-ologically justified. That’s not what I meant.
I just read through this thread. Zebra, your ignorance and naivete are astonishing, and not in a good way. You know Jack Freaking Spit about IT. In #223 you say:
Here’s the thing. Substitute your Point 5 for Point 1. The idea of a team of super-hackers breaking systems and stealing data has been wrong for years. It is far faster to bribe or blackmail someone who has access to the database.
As capnkrunch mentioned in #187, the size of such a database combined with the number of researchers who would want access means a huge attack surface. Sooner or later, a determined criminal would locate someone with access who was open to bribery or at risk of blackmail. From there, it would be a simple matter to get the data. In addition, in order for data to be useful, it would have to be in an easily processed format. It would take a fair amount of digging, but eventually one could gain enough to identify (and blackmail) specific individuals.
Not A Troll #259,
Since I am not suggesting a law mandating that anyone, much less everyone, contribute DNA, it would definitely not be illegal.
I can’t very well be asked to apply labels like unethical or immoral since I have stated my position that they are vacuous.
But it would be like not vaccinating your child if you perceived that your child had a negligible risk of contracting the disease in question, and perceived a higher risk from the vaccination itself.
(You would be acting in your perceived self-interest, as all humans do.)
Since in the vaccine case, the perceived negative outcome (physical harm from the vaccine) is of greater negative consequence than the perceived negative outcome from submitting your DNA*, one might argue that you would be acting less rationally than the anti-vaxxer.
But as you say, this kind of decision is “malleable” or subjective. Who am I to judge?
*A perceived harm for which there is zero evidence, unlike the perceived harm from the vaccine.
Well yes, we all know that if we look at reality, but some engage in self-delusion, just like the people who think that they have an immortal soul.
Neither your soul nor your “human rights” mean anything to a firing squad. Your “right to privacy”, which is such a big issue for you apparently, means nothing when you are stripped naked in a cell and a broomstick is shoved up your ass.
But carry on, someone has to man the barricades against the possibility that one’s genome will reveal embarrassing information. And then there’s the threat of Reiki to deal with…
“Who am I to judge?”
That’s an easy one. As a buffoonish, petulant, truculent blatherskite, you’re the best excuse I’ve had in a while to combine some of my favourite words.
So thank you for that.
(a) If that’s the criterion for conceptual vacuity, creating a DNA database for an entire year’s birth cohort is a vacuous idea.
(b) If you believe you have an immortal soul, it doesn’t matter what it means to a firing squad. That’s not the point.
(c) Execution by firing squad is not necessarily a human rights violation. I’m 100 percent opposed to capital punishment in all circumstances. It’s one of the issues I feel most strongly about. And even I don’t think so.
In my experience — which does not include the precise scenario described above, I admit — the reverse is true. Human rights never mean more than they do when they’re being violated. Or when they have been.
I mean, being subjected to sustained, systematic atrocities with no hope of escape or relief for a long period of time sometimes breaks people so that they no longer care. But assuming that there’s a semblance or possibility of human rights in the picture to begin with, what you’re saying is just false.
Abner Louima, whose experience does include something like what you describe, became an activist against police brutality because he survived it.
Not that it’s really material to the instant point, but fwiw, he evidently also found meaningful aid in his belief in an immortal soul, according to wiki:
A DNA database of an entire year’s birth cohort genuinely does mean nothing when you are stripped naked in a cell and a broomstick is shoved up your ass.
So what’s your point?
What the heck are you even talking about? My right to not be a slave is also a big issue to me and is similarly meaningless at gunpoint. For a guy who loves to complain about non sequiturs you sure make good use of them.
zebra # 272 says that not contributing your DNA for genome studies
“… would be like not vaccinating your child if you perceived that your child had a negligible risk of contracting the disease in question, and perceived a higher risk from the vaccination itself”
and you are right if you believe that
“…the perceived negative outcome (physical harm from the vaccine) is of greater negative consequence than the perceived negative outcome from submitting your DNA”.
This sort of reasoning probably would be justified if the person who decides is the same person who must bring the consequences of a wrong choice. Maybe this is understandable in a climate of “my son, my property”. But in today’s civilized countries this is no longer considered acceptable.
Different ethics, perhaps?
Ann, now you are joining Krebiozen in out-of-context quotes with silly word games and irrelevancies.
I was responding to capnkrunch. The meaning is clear. There is no difference between UDHR and a bible. Neither matters. What matters are the choices of those who have control of what happens, which ultimately means the sovereign entity. “God is on the side of the big battalions” or some such quote. Napoleon?
And I never said people who delude themselves about souls and rights don’t benefit from the delusion; it is exactly my point that feeling better is obviously why they do it. That doesn’t make it less of a delusion.
“This sort of reasoning probably would be justified if the person who decides is the same person who must bring the consequences of a wrong choice.”
I think you meant “bear the consequences.”
But I don’t see how this relates to the reasoning itself. I am just comparing the two negative outcomes. In one case, there is a small chance of a bad reaction having a serious physical effect on the child.
In the other, there is some fantastic scenario (that not one person has been able to describe in even a tv-plot fashion) that leads to some future embarrassment or “loss of privacy”.
In both cases, the parent does make the decision for the child; but again, it is the perceived consequences that are being compared.
Like I said before calling something out of context doesn’t make it so. If that’s how you want to dismiss something you should provide the proper context and explain how the meaning of the quote has changed. It’s not difficult to do, and significantly strengthens your argument. Actually, dismissing something simply by yelling “out of context” or “cherry picking” just makes it seem like additional context actually has no bearing on the meaning and you simply don’t want to address inconvenient quotes.
Also, I have provided several references describing how confidentiality of genetic information can be breached, including one actual proof of concept attack. The appropriate way to compare risks is cost/benefit comparisons. Benefits of vaccines far outweigh any risk. Several experts have told you that mass collection of genomes is useless at this time. Even if risk was purely theoretical (it isn’t; deidentification attacks have been proven as I already referenced), because there’s no benefit there’s no reason to take the risk. This is what I was alluding to before when I said you were using the wrong risk assessment tool. Comparing relative risk of storing personal medical records (or vaccinations) to a centralized research database is apples and oranges. When you compare cost/benefit it’s clear your idea carries unnecessary risk while in the other two benefit greatly exceeds risk.
I just did that. #278
Saying I didn’t doesn’t make it so.
See also Gish Gallop. The common practice here is to provide incomplete quotes– just phrases– and then riff on some interpretation of some word in the quote, or just start pontificating on something tangential to the meaning of the original comment. The result is that one would spend lots of time trying to refute all those irrelevancies, distracting from the core topic.
Another ploy is drive-by criticism, like what you just did, to create distraction.
“But it would be like not vaccinating your child if you perceived that your child had a negligible risk of contracting the disease in question, and perceived a higher risk from the vaccination itself.”
That’s a remarkably piss poor analogy. With vaccines there is a risk to the individual but also a known benefit to him/her and society whereas your birth year genome DB has no demonstrable benefit to a living individual. Sure, you can do something for the advancement of science (and that’s about all I’ll get out of my study + 25 dollars), but that is no one else’s business but the individual’s.
For one who speaks of what matters are the choices of those who have control of what happens, I don’t understand your mythical belief in no downsides. It isn’t like something such as eugenics never happened or that designer babies aren’t a thing or that the non-productive in society aren’t discriminated against or that even now there are those who speak of an age when people are expendable. I would love to see it (not) when they say those with certain genes are.
There’s better politics and there’s worse politics and neither of us are mind readers to know how that is going to play out. So for those who want to donate their DNA have at it and for those who don’t you can leave them alone. Unlike the unvaccinated who are a risk to others there are no issues with DNA non-donators. If the science gets better then I expect more people to be on board with it but to expect people to do so now is kind of a joke.
I’m sure you meant 277 but that reminds me of how much I agree with [email protected] Your nonsense sounds like the kind of half baked ideas I’d expect out of a student who just got an A on their first test in intro to philosophy.
No, you neither provided additional context nor explained how the meaning of the quote changed. ann already used the quote in the context of it being a reply to me; otherwise the reference to vacuity doesn’t make any sense. If anything your additional comments make ann’s point even stronger. If all that matters is the choices of those in power then anything that doesn’t directly work towards gaining or maintaining power is a vacuous idea.
# zebra 279
“Bear the consequences”, of course.
My point wasn’t about which is the right choice in this question (in my personal view: to vaccinate and not to provide DNA).
But about who has the right to make the choices.
Giving my own DNA has consequences only for me, so the choice is very personal.
Not to vaccinate involves other people, too (our own children or non immune lay people). So it can be argued that the choice should be made not by parents, but by a public agency.
Speaking of creating distraction I don’t think I’ve yet seen you make an actual response to a single criticism of your idea. It’s always out of context this, non sequitur that, I don’t agree, 20th century thinking, where’s the evidence (ignoring multiple references provided of course). More recently it’s this pomo everything’s the same nonsense. Look, if you don’t think protecting privacy is important that’s fine. HIPAA says otherwise and you seem to care about enforceable laws. How will your database protect confidential health information when genomes can be personally identifiable?
What you haven’t provided is any plausible scenario in which the project I described has any risk of harm to any individual. That would involve a specific narrative, which could even be quite hypothetical– as I said, like some tv crime or spy plotline. Nothing.
Your interpretation of what “several” experts have said is as irrelevant as your beliefs about what constitute human rights. I’ve answered Retro Pump way back at 112. The disagreement is over the rate at which the data is converted to digital format.
I’ve also pointed out that there are obviously experts who don’t agree that large, centralized databases of genetic information are useless at this time. If you want to argue the specifics of my idea v the Brit 100K genome project, that would be fine, but you’ve already said that you are against any further collection of genetic data. Which isn’t feasible anyway.
OK. But the public agency could also decide that DNA must be given, as in the Kuwait example ann provided.
So, we are back to politics, and the power of the sovereign, whether that is a democratic entity or a king.
I just realized that maybe you arrived late and were not clear on what the actual project was that I suggested. DNA would be collected from all the children born in a specific year, so again it would be a parental decision in both cases.
I got that that was your point.
Mine was that if the criterion by which you decide whether a concept is vacuous or not is whether or not it saves you from a firing squad and/or being stripped naked in a cell and having a broomstick shoved up your ass, the concept of establishing a DNA database for an entire year’s birth cohort is completely vacuous.
The concept of human rights, on the other hand, provides some protection to most people who live in a society that respects them, as well as some recourse to those whom they don’t protect.
IOW: You’re using what you call a grandiose strawman. Human rights also don’t save people from dying in car accidents. That doesn’t mean they’re completely useless.
I honestly don’t see how that was either out of context or silly or irrelevant. I addressed your entire argument, which was that ethics and human rights are vacuous because if you’re in front of a firing squad or being stripped naked in a cell and having a broomstick shoved up your ass, they won’t help you.
In the event that my meaning wasn’t clear: While that may be narrowly true if you’re already in front of a firing squad or being stripped naked in a cell and having a broomstick shoved up your ass, there’s not much about which the same couldn’t be said. And it’s entirely false if your aim is to avoid ending up in that kind of situation. The concept of human rights does more to keep people out of it than anything else, by greatest-good-for-greatest-number standards.
That’s what I said in response to capnk, where his statement is in quotes within the blockquote.
I didn’t say that people’s beliefs didn’t give them comfort.
I didn’t say that the delusional beliefs of other people have no effect on the situation in which the individual finds himself.
So you are indeed the one strawmanning.
No, exactly no. What best keeps people from in front of the firing squad is self-interest. It is the experience of Martin Niemoller. It is when people stop following that lesson, and start using delusional, arbitrary, “beliefs” to guide their society, that problems arise.
You know, “an eye for an eye” pretty much justifies that death penalty you don’t like.
This is ridiculous. Not long ago I pulled up zebra for inaccurately paraphrasing people and putting those inaccurate paraphrases in quotation marks. He was apparently unaware that quotation marks generally denote a direct quote; he even claimed that putting paraphrases in quotes is common practice (it isn’t) and told me to “get with the modern era”.
Now he is complaining about accurate paraphrases of his comments that are not in quotes? Perhaps zebra would like to specify where I, or anyone else, have changed the meaning of his words by quoting him out of context (difficult when the context is right above on the same page), or where I, or anyone else, have made any argument that does not follow from its premises (i.e. a non sequitur). What zebra writes is quite ridiculous enough that no one has to change the meaning of his words to criticize them.
It seems to me that zebra is simply too dim to understand it is possible to paraphrase someone’s words without changing their meaning, and lacks the basic scientific knowledge required to see how some arguments do follow from their premises. Either that or he still hasn’t grasped what “out of context” and “non sequitur” mean.
Silly word games? I plead guilty, though not on this thread, other than a reference to PoMo and a hypothetical example of data that some might like to keep private as tiny-penis-syndrome, neither of which really count as word games.
As for irrelevancies, zebra has treated us to a plethora of those, from an irrelevant Pynchon novel to someone being physically assaulted in a cell, and other bizarre analogies that make no sense whatsoever.
Back to the issue under discussion: I don’t know how large the risks are because I don’t understand the details or scope of the problem well enough, which is why many countries have set up committees of experts to look into this, assess the risks and formulate necessary legislation. I don’t think we should ignore potential risks and just hope they come to nothing. We have already had some tastes of the possible problems we may face; here’s an example (adapted from a real case) of genomic information being used to discriminate against a person:
There are other examples of this kind of discrimination on the same page. Isn’t it best to figure out how to avoid more of this sort of thing in the future before we start sequencing the genomes of millions of people?
Speaking of the 100,000 Genomes Project and one such group of experts who Krebiozen referenced previously:
You and your 20th century punctuation.
See also Gish Gallop. The common practice here is to provide incomplete quotes– just phrases– and then riff on some interpretation of some word in the quote, or just start pontificating on something tangential to the meaning of the original comment.
^ Well, that’s an odd blockquote failure mode.
“Well, that’s an odd blockquote failure mode.”
You just had an errant one that resulted in nested ones. Btw, did you figure out your issue with the winking face emoticon?
Okay, reading this long thread, I’m getting the impression zebra is unaware of exactly how much genetic information a human being has.
RE: JP #151
I’ve been away dealing with real life, but –
I wasn’t proposing that we should have a law to prevent the mentally ill from possessing firearms. We already have a suite of state and federal laws that say so, and it’s my understanding that the Supremes have said they are an acceptable restriction (HDB take note), so if you want to fight it, your argument is with your Representatives, not me (but I do think these laws are just fine).
You also may have the impression that I think all mental illnesses are the same, but both the law and I are well aware that there are many different types of mental illness, and not all (and maybe not even most) should lead to restrictions of any rights or freedoms.
Your link doesn’t link. I agree that most mentally ill people are mostly harmless most of the time, and some are completely harmless all of the time, but I don’t agree that “the mentally ill” are no more violent than “normal” people. According to this http://www.bjs.gov/content/pub/pdf/mhppji.pdf ,
The same source says 61% of the state prison population who had current or past violent offenses had mental problems. This source ( http://www.nimh.nih.gov/health/statistics/prevalence/any-mental-illness-ami-among-adults.shtml ) says the prevalence of any mental illness among adults is 18%, so it would seem that the mentally ill are more likely to be violent than “normal” people. Or maybe the mentally ill are just more likely to get caught.
But none of that was the main point that I was trying to make, which seems to have been lost. I was trying to offer a scenario where the existence of a complete medical database could (depending on your point of view) be used or abused. Let me try again.
We have laws that restrict some mentally ill from buying and possessing firearms. However, these laws are mostly toothless, because there is no mechanism for any gun seller to know if the purchaser is amongst those restricted by these laws. Zebra’s metadatabase would have the necessary information to enforce the law. Part of me thinks this would be a good thing to use this information to prevent some portion of violent crimes.
But a larger part of me thinks it would be bad. HIPAA laws make it clear that medical information should only be used to provide medical care. I like this law just fine, too. If we allow an “except for” to come in, we can expect others to follow. The government does have a record of using the data it holds to do bad things (census data and the Japanese internments come to mind). But Zebra seems to think that HIPAA should be darned to heck, and if y’all can’t convince him otherwise, I’m sure I can’t. Y’all have probably made this argument somewhere along the way (I hope to chew thru this in the next day or so) so I apologize for any redundancy.
Zebra may say that this is an argument against UHC. But again, it’s HIPAA that would be our protection, and it seems to me we should likewise protect it. Our medical information should only be used for the benefit of the patient, unless the patient gives clear, informed consent otherwise.
“Or maybe the mentally ill are just more likely to get caught.”
In addition, it may just be that treatments and social support for mental illness are vastly underfunded and they end up in prisons because they are nuisances and nobody else is taking them in.
Here is my link; with any luck, it’ll work.
“Mental disorder” is awfully broad and probably includes substance abuse disorders. People with mental illness who don’t have a substance disorder are, according to various studies, either no more likely or only slightly more likely to commit violent acts compared to the general population.
We’re talking about SMI here, though, also. There is *one* mental disorder which does correlate with increased violence, and that is antisocial personality disorder (or sociopathy), but that is not considered SMI.
In any case, sex (male), age (youth), and socioeconomic status (poor), correlate much more strongly with violent crime than mental illness does.
^ Substance disorders are considered “mental disorders” even on their own, not concurrent with an “Axis I” disorder. (I think psychiatrists have lately stopped using the Axis model, but you get the idea.)
I forget how this one got started, and I’m on deadline again, but I’ll briefly note that in my state, as I recall the FOID application, one has to have been involuntarily committed to be denied a license. This is an obvious, verifiable standard, but it represents a very small subpopulation, indeed.
No, that’s just you creating a straw man out of what I said by cherry-picking and paraphrasing. As I said:
^^See that right there? Please address it. It’s my main point.
But let it be duly noted that it’s no less true to say…
…that what you did say is. Because that’s my point. As a standard for establishing the vacuity of concepts, that’s a useless construction. All concepts fail it equally.
You’re therefore still one reasonable argument short of demonstrating that ethics and human rights are vacuous.
Please ante up.
FWIW, I don’t disagree that it’s in everybody’s self-interest to see that each and all are equally protected from abuse. But that’s the principle on which the laws and ethical standards you’re objecting to are based. Respect for persons, for example.
Pretty much. But since that particular concept goes back 4000 years and has never (afaik) been absent from any human society on record***, it’s probably not the best example you could choose of the kind of delusional arbitrary belief that leads to trouble when people start being guided by them. In order to know whether that was true, they’d have to stop first.
***Before you tell me that there are societies that don’t practice capital punishment: I know. I said “the concept.” It has numerous other iterations. Too numerous to count, really.
Anyway, back to the topic at hand for a moment, allow me to hearken back to carts and horses.
Has anyone figured out why Z. thinks a birth cohort is some sort of especially interesting sample per se? It can’t be pure size, given that other countries can randomly be plugged into the Master Plan (and given that Z. is demonstrably unclear about the size of the U.S. one).
You’re not going to capture any particularly granular environmental data specific to a generation unless, oh, say, the medical records are also going to geolocate the genome over time as it wanders to and fro, so, why?
The only thing I’m coming up with offhand is normalizing actuarial models.
Or maybe people who are incarcerated are likelier to have been evaluated and diagnosed than people who aren’t.
Or maybe being convicted of crimes that (in many more cases than one would like to think) you may not have committed and then locked up for years has a bad effect on your mental health.
Or maybe being in prison =/= being violent. There are a lot of people in prison for dealing drugs, for example.
Or maybe all of the above.
Except not maybe. JP is entirely correct. The mentally ill are no likelier to be violent than anybody else is.
More things zebra doesn’t get:
Huge government databases are hacked all the time. See the ever-expanding government employee personal information database hack.
If the medical records are linked to the genomes, that information alone is valuable enough to hack. And if any of it were attached to SSN? It could be used to blackmail or influence the parents of a child as well.
Ethics are not optional in human subject research. Not in the US, or the UK, or really anywhere in the world. There are standards and treaties and laws about these things. IRBs are not optional, and they are not optional for a reason.
Zebra: do you know why ethics and human rights must be considered *first* in all human subject research? If you don’t, and can’t find it in a 2-minute search, then you are a buffoon, a liar, or a monster.
And for public health surveillance (at least). I’ve gotta get my ass in gear, but I’ve already mentioned the Vaccine Safety Datalink. I don’t know whether the plan participants of the HMOs who are the data stream are represented on an opt-in, opt-out, or “neither” basis, but it strikes me as a real-world example that Z.’s trip should at least be able to make specific contact with.
Not “big picture” enough, though, I suppose, or something.
But how is that my criterion? This is a truly bizarre interpretation, which is perhaps why I think you are strawmanning?
What makes “rights” vacuous is that they cannot be demonstrated, which is why I “equate” them with “souls”.
Sorry if that wasn’t clear enough to begin with.
You only have “rights” in the context of the legal system, (stipulating, of course, that the legal system operates the way it is supposed to.) Your “rights” are granted by the sovereign entity.
If you don’t get this I am happy to discuss it further.
“Well, that’s an odd blockquote failure mode.”
You just had an errant one that resulted in nested ones.
I meant the inversion. It’s neither here nor there; I just wanted to note the error.
^ I’m going to take that additional blockquote fail as a clear signal from the perceived world that this is the wrong time to “multitask.”
IIRC the mentally ill are far more likely to be victims of violence than perpetrators. That said, some years ago, as I have related here before, I had an unfortunate experience with a close friend who had some serious mental problems (delusions) but since he was not presenting a threat to himself or others, and despite my best efforts, I couldn’t get any doctors interested.
Eventually I contacted his mother in Scotland and paid his train fare to spend some time with here, thinking a bit of time out of London would do him some good. Sadly while he was there his condition worsened and he killed his mother, rather horribly and then calmly called the police.
I did visit him in the secure mental hospital he was placed in (the Scots sent him back to London), and kept in touch for a while after he was released. I never felt quite comfortable with him after that, even though he was on depot anti-psychotics and asymptomatic, and have since lost touch.
That has left me with an unease around people with SMI that I know isn’t entirely rational, but it’s still hard to shake off.
^ “some time with her”
That’s terrible; I’m sorry to hear that. I know of a similar situation involving the older brother of a school friend, but I wasn’t nearly as close to what was going on as you were.
The point is that mental illness doesn’t cause violence over and above what you’ll find in the general population; it doesn’t mean the mentally ill aren’t ever violent, just like other people are sometimes violent. Young men have a particular risk of it in general.
As far as the SMI label goes, I mean, I’m pretty sure I’m probably in there. Severe recurrent depression and a nice case of PTSD counts as “serious” or “severe,” I’m pretty sure, even if I’m not having, say, command hallucinations. (Which also don’t incline people to be more violent, just by the by.)
I mean, most of my friends are crazy, too. I guess I sort of have a hard time grokking anybody who isn’t “broken” in some sort of profound way.
I offered to capnkrunch that he could make up even a far-fetched, tv crime or spy show scenario about how all this blackmailing and stuff would happen, but he couldn’t hack it. (pun intended)
I make you the same offer. I’ve watched my share of that kind of thing, and I can’t figure out what dark or deranged imagining is going on in your minds. How would it work? Why would someone go to my database instead of just bribing the person working for the local health insurance company, or the doctor’s office?
So why don’t you like me, phony?
“Hey, what’s up with that zebra, why doesn’t he want to just belong?” (#255)
Or is all your drama just middle-class spoiled brat social-media angst?
Zebra @317: Sure, I’ll invent you a crazy scenario involving a diplomat with an unknown genetic allergy to antibiotics and a terrorist group that wants him dead…
But ONLY after you answer my question about human subject research and ethics.
It can be answered in one word, if you’re lazy.
Because you’re not terribly likable? I generally like people who are nice and smart and know what they don’t know.
Someone who refers to his wife as “perfectly serviceable” does not immediately strike me as friendship material either, just by the by.
Oh, Zony, stop it. My sides, they’re killing me.
All the pieces are there, I shouldn’t have to weave a f*cking narrative for you but here it is anyways. Insurance company (via some lab) requests information for, say, diabetes. They do legitimate whatever with it but simultaneously use various methods to reidentify the genomes. This information is then used as a kind of blacklist, people they won’t sign. It’s illegal to deny coverage for pre-existing conditions but now they have access to information that they shouldn’t and mountains of plausible deniability. “We didn’t deny coverage because of the pre-existing condition. There’s no way we could have known,” (recall the data is ostensibly deidentified).
How about in the reverse. Some researchers suspect they have discovered a mutation that greatly increases risk of leukemia. To help verify they request the histories of everyone with said mutation. Somehow an insurer gets hold of the information and reidentifies it and hikes up everyone’s rates.
Heck, hackers compromise healthcare servers (insurers, providers, etc) all the time; I don’t know what they do with the information but it’s certainly nothing good. Despite what you seem to think, I imagine bribing an employee probably puts the attacker at far greater risk than a spearphishing campaign; I don’t recall any recent attacks where bribery was the method of entry. Why give hackers another target? BCBS was too secure? Try the genomics database.
You may have watched your share of TV shows and movies but, like Julian Frost noted before, you appear to know jack sh!t about security. It’s common knowledge Hollywood rarely does any technology justice. You’re a fool to think that means anything.
In any case, a “TV show scenario” involving private health information isn’t hard to think up at all. We can even use SMI as an example.
What if, as has been brought up several times, scientists figure out how to read the genome for things like schizophrenia? I do in fact know a couple people who have, yes, schizophrenia, and are not “out” about it, either at work/school or with non-close friends. They have every right to keep that information private, given the kind of stigma it carries. I can well imagine that being “outed” would be an awful experience for them on a lot of levels.
Why would somebody use that information against somebody else? I mean, if they’re jerks, and enjoy the idea of making other people miserable and intimidating them out of their money, why wouldn’t they? It’s more information out there, your entire genetic code. Why wouldn’t people go after it, regardless of whether they also go after other health information? Again, we don’t know right now just how much information about a person is in their DNA.
^In any case, I gotta jet. I must bike to the Asian Market (I guess) and find a fruit I haven’t tasted in 9 months before I go to a Rosh Hashanah thing.
I looked for something recent on the subject of violence and mental illness and found what I think is one of the best overviews of the subject that I’ve ever seen here
Krebiozen, I’m sorry that all involved went through that. Truth be told I am also leery of those plagued by psychosis. But it is no more than I would feel with anyone else who I sense is unpredictable and/or acting erratically. That includes the police, business managers and angry people. It is just something instinctual and I doubt I will ever be able to change it.
I started wondering about Plomin, and his confident claims back at the end of the last century in 1993 and again in 1998 that he was about to discover the genes for intelligence, Real Soon Now. He seems to have found a fortunate niche where being wrong (and spending millions of dollars and decades of time) only increases his reputation, and his most recent, largest search for the genes for intelligence admitted total failure in 2010.
So he has sent the genomes to a Chinese colleague to be sequenced safely away from IRBs, and as of 2013 he was promising success within the year.
Of course it wasn’t clear. One of your examples was of a right being violated; and the other didn’t necessarily implicate any rights at all.
Neither illustrated the proposition that your rights can’t be demonstrated. And it’s not true that they can’t. For example:
Are you expressing your beliefs and opinions about the vacuity of rights on this very thread without fear of reprisal from the state?
Well, there you go. You’ve just demonstrated that one of your rights is alive and well.
Yes. You also only have citizenship in the context of the legal system. Ownership of property, too. It’s also what prevents landlords from shutting off the heat in the middle of the winter to save money and the reason why you don’t have to work a seventy-hour week to keep a job.
There are lots and lots of perks and benefits you only have in the context of the legal system. What of it?
In this country, your rights are granted by the (wait for it) Bill of Rights. And the US Constitution. Assuming that “sovereign entity” means “the state,” that’s not really the same thing.
But for the sake of argument, let’s say that it is. So what?
That’s okay. I’m good, actually. The “So what?” and “What of it?” were just rhetorical questions.
Anyway, if the thing that makes rights and ethics vacuous is that they (putatively) can’t be demonstrated, you need to make a case for it. Because you haven’t yet. So let’s just stick to that.
On further reflection, I realized that even if the vulnerability (as it were) used to compromise PHI was bribery, zebra’s data still greatly increases that attack surface. There’s a whole new set of people with access to confidential data who might be susceptible to bribery. No one at UHC bites? Try the genomics guys. Of course, this bribing the peons discussion is vacuous for a number of reasons, not the least of which is that the idea is actually more Hollywood than elite ninja codebreakers or whatever zebra was going on about.
I think what zebra is going for is that your rights only exist so long as those in power choose to allow them to. I get the feeling his AP History class just finished their section on Hobbes.
In the same sense that your running water does, sure. But you know. Your life only exists so long as someone with the power to kill you chooses to allow you to live. Other people are always around somewhere, having power to affect you some way. Can’t be helped. C’est la vie.
Details — such as how realistically likely it is that those in power will cancel the Bill of Rights, or be able to — really matter therefore.
Don’t quit your day job and try to become a scriptwriter. (But I appreciate that you made the attempt.)
The thing is, I asked you to do that for my project. That would be (let’s say, assuming lots of opting out) a round figure of 200K samples, derived from a particular year’s births.
I may be misinterpreting the specifics of what you suggest, but it sounds like an insurance company bribes a qualified academic researcher to submit a request for a set of genomes to be used in a “cover” project.
That data set would be limited to those individuals in the birth cohort with the specified (“cover”) medical condition, and probably a fraction of those if the number is large. Let’s say 5K to be generous.
So we are talking about all kinds of criminal risk in order to identify the fraction of 5K individuals with condition A who also happen to have the genetic risk factor for condition B? And who might be applying for insurance with that particular company? I just finished logic101, so I think I will call that reductio ad absurdum. (sarcasm)
There just is no way, no matter how many times you repeat “threat surface”, to claim that participating in my project adds anything to the existing risk of harm from unauthorized access to an individual’s health information. It just makes no sense, even if you could access the entire thing, to bother with such a restricted and really trivial amount of information for nefarious purposes.
And even though I don’t think the identification-through-the-genome route is a big problem, the existence of my database would reduce the risk to the general population. That should be obvious.
That’s because you don’t understand the basic issues, which in turn devolve to your wholesale failure to describe in even rudimentary fashion how the Bright Idea is supposed to work in practice, aside from “Computer, give me the entire genomes of all persons with condition X that satisfy this random list of natural-language terms.”
How many times do you have to be told that no real-world, arbitrarily whole-cohort, data bank linked to complete medical records is just going to barf up whole genomes? Why can’t you answer the simple question of what the interface is supposed to look like? With a genome in hand, can one “freely” pull the medical records? There might be something in there somewhere!
It’s not as though you’ve had a shortage of time or lack of genuine input with which to put together something resembling a contentful response, but instead you’re just jabbering incoherently about weird jail fantasies and so on.
I mean, yah, I think I’ve already noted that you’re Gumby, dammit, but WTF? If you’re going to whine about “the core topic” – and if the “core topic” is something other than that Z. is so bright you have to wear shades – why are you so enthusiastically running the hell away from it?
“There are lots and lots of perks and benefits you only have in the context of the legal system. What of it?”
“Anyway, if the thing that makes rights and ethics vacuous is that they (putatively) can’t be demonstrated, you need to make a case for it. Because you haven’t yet. So let’s just stick to that.”
“Your life only exists so long as someone with the power to kill you chooses to allow you to live.”
OK, so you answered it yourself, gratis capnkrunch. You can’t demonstrate that you have a “right to life”.
Note to capnkrunch: If I have to end up sounding like philosophy101, perhaps it’s because everyone else sounds like they never took that class, and it is necessary to (re-)state the obvious?
I like to think she married me primarily because of my sense of humor, (as well as compassion and intelligence, of course). Not just that other thing.
Once again, zebra displays his ignorance of both IT and security risks.
Given the size of each DNA sample, we are talking petabytes, if not exabytes, of data. That will need storage space, electricity to run, and people to maintain both the physical hardware and administrate the software. Queries would take a very long time. It’s not yet practical.
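For a rough sense of scale, here is a back-of-envelope sketch of that storage figure. The 200K sample count is the round number floated upthread; the per-genome sizes are assumptions picked purely for illustration (raw reads plus alignments versus variant calls only), not measured values, so treat the output as an order-of-magnitude guess at best.

```python
# Back-of-envelope storage estimate. Every input here is an assumption for illustration.
SAMPLES = 200_000             # round cohort figure mentioned earlier in the thread
GB_PER_GENOME_RAW = 100       # assumed: raw reads plus alignments for one whole genome
GB_PER_GENOME_VARIANTS = 1    # assumed: variant calls only, no raw reads


def total_petabytes(samples: int, gb_per_sample: float) -> float:
    """Total storage in petabytes, using decimal units (1 PB = 1,000,000 GB)."""
    return samples * gb_per_sample / 1_000_000


raw_pb = total_petabytes(SAMPLES, GB_PER_GENOME_RAW)
variants_pb = total_petabytes(SAMPLES, GB_PER_GENOME_VARIANTS)
print(f"raw: {raw_pb:.1f} PB, variants only: {variants_pb:.1f} PB")
```

Under those assumptions the raw data lands in the tens of petabytes, while storing only variant calls is a couple of orders of magnitude smaller. Which of the two the proposal would actually need is exactly the kind of detail that hasn’t been specified.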
That’s one of several possibilities. Another is a gang who blackmails or bribes either a researcher or an administrator into giving them certain data. Remember, in a database that big, the number of users and admins is huge, meaning a HUGE attack surface.
Or the data could be something like who meets the genetic risk criteria for a certain disorder like e.g. schizophrenia.
Except that your project is far less secure than you assume, and that the risk of the project getting hacked is far greater than the risk of someone’s medical files getting exposed and the data used nefariously.
It isn’t obvious. If anything, it is obvious that your proposed database significantly increases the risk to the general population.
If I recall correctly, there have been very few retractions from the GCN Circular.
Sweet G-d, do you still think that the U.S. birth cohort is “385,000”?
Oh, wait, insurance companies aren’t able to employ “qualified academic researchers”? The wholly unaddressed regulatory structure becomes more baroque by the moment.
Seriously? What don’t you get about additional targets equals more exposure? If hackers are interested in stealing PHI from hospitals they will be interested in stealing it from your database.
Regarding insurers, there have been cases of rates being increased due to genetic risk. Certainly the largest database of genomic information would be attractive but it is unlikely. That said, the reason I originally said these narratives don’t matter is that the onus is on you to ensure the information remains confidential, not on others to not abuse it. Regardless of risk, HIPAA says that legally you need to offer protections that our current technology cannot provide for a database that size.
If your counterpoint is the 100,000 Genomes Project there are two problems. First is that privacy concerns are alive and well in that project (see the Nuffield Council link Krebiozen provided earlier). Second is that England has far more lax confidentiality laws.
Or maybe it’s that most people graduate that kind of sophomoric babbling with high school. Don’t misunderstand. You don’t sound like philosophy or whatever 101. You sound like someone part way through the course with just enough knowledge to make yourself look like a fool and not enough to realize why.
Krebiozen doesn’t like me suggesting that people are CUI (whatever the kind of influence) but sometimes it is difficult not to conclude that.
You have sputtered and blustered about the number multiple times, so let’s read the reference you gave: (#44)
I take the rest of your blustering and sputtering to also be the result of whatever it is, and not worth wasting my time.
Julian Frost:
You’re wasting your time. zebra thinks queries are zero cost actions and that selecting subsets with specific criteria is “trivial”. I gave up trying to explain the hardware and software limitations after he hand waved away my references as cherry picked and refused to even look at the sources I provided. Of course he’s done the same for the privacy concerns so I guess I’m just as guilty of time wasting.
I’ll provide my references once more, mostly because I think Julian Frost might find them interesting in case he missed them earlier but also on the off chance that this time zebra will decide to at least make an attempt to educate himself.
Big Data: Astronomical or Genomical?
Routes for breaching and protecting genetic privacy