<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"><channel><title>Think Gene - Latest Comments in How Much Data is a Human Genome? Not Much.</title><link>http://thinkgene.disqus.com/</link><description>a bio blog about genetics, genomics, and biotechnology</description><language>en</language><lastBuildDate>Sun, 22 Nov 2009 13:29:47 -0000</lastBuildDate><item><title>Re: How Much Data is a Human Genome? Not Much.</title><link>http://www.thinkgene.com/how-much-data-is-a-human-genome/#comment-23824225</link><description>The reference human genome still contains some unknown portions, so you need to be able to represent at least one possibility in addition to ACGT. But since you were probably talking about the real human genome, not the current unfinished data, that problem wouldn't apply.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">playedonline007</dc:creator><pubDate>Sun, 22 Nov 2009 13:29:47 -0000</pubDate></item><item><title>Re: How Much Data is a Human Genome? Not Much.</title><link>http://www.thinkgene.com/how-much-data-is-a-human-genome/#comment-23818581</link><description>As there are 4 bases, the base id could be represented in a 2-bit field for each base in a sequence. So, for storing, A G C T could be stored as 00 01 10 11 respectively and then retrieved and converted to human-readable A G C T.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">bridalshowercakeideas</dc:creator><pubDate>Sun, 22 Nov 2009 11:02:03 -0000</pubDate></item><item><title>Re: How Much Data is a Human Genome? Not Much.</title><link>http://www.thinkgene.com/how-much-data-is-a-human-genome/#comment-23685598</link><description>As there are 4 bases, the base id could be represented in a 2-bit field for each base in a sequence. 
So, for storing, A G C T could be stored as 00 01 10 11 respectively and then retrieved and converted to human-readable A G C T.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">worldcupshirts</dc:creator><pubDate>Sat, 21 Nov 2009 00:26:31 -0000</pubDate></item><item><title>Re: How Much Data is a Human Genome? Not Much.</title><link>http://www.thinkgene.com/how-much-data-is-a-human-genome/#comment-23524189</link><description>To determine incense’s psychoactive effects, the researchers administered incensole acetate to mice. They found that the compound significantly affected brain areas known to be involved in emotions as well as nerve circuits that are affected by current anxiety and depression drugs.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">modeling auditions</dc:creator><pubDate>Thu, 19 Nov 2009 06:03:32 -0000</pubDate></item><item><title>Re: How Much Data is a Human Genome? Not Much.</title><link>http://www.thinkgene.com/how-much-data-is-a-human-genome/#comment-23114877</link><description>The base id could be represented in a 2-bit field for each base in a sequence. So, for storing, A G C T could be stored as 00 01 10 11 respectively and then retrieved and converted to human-readable A G C T.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">loanscash</dc:creator><pubDate>Sun, 15 Nov 2009 00:35:05 -0000</pubDate></item><item><title>Re: How Much Data is a Human Genome? Not Much.</title><link>http://www.thinkgene.com/how-much-data-is-a-human-genome/#comment-23114840</link><description>The reference human genome still contains some unknown portions, so you need to be able to represent at least one possibility in addition to ACGT. 
But since you were probably talking about the real human genome, not the current unfinished data, that problem wouldn't apply.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">loanscash</dc:creator><pubDate>Sun, 15 Nov 2009 00:33:34 -0000</pubDate></item><item><title>Re: How Much Data is a Human Genome? Not Much.</title><link>http://www.thinkgene.com/how-much-data-is-a-human-genome/#comment-21932846</link><description>There is also evidence that skipping breakfast is now common in the developed world: in the USA, the proportion of adults eating breakfast fell from 86% to 75% between 1965 and 1991.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Cheap_Leaflet_Printing</dc:creator><pubDate>Thu, 05 Nov 2009 07:23:05 -0000</pubDate></item><item><title>Re: How Much Data is a Human Genome? Not Much.</title><link>http://www.thinkgene.com/how-much-data-is-a-human-genome/#comment-20940583</link><description>Humans are diploid: they have two of each autosome and two sex chromosomes. So this is the size of a reference haploid human genome, not a complete human individual genome, which would be twice as much data. (2 music CDs) Thanks, neandrothal!</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">xmas_gifts</dc:creator><pubDate>Sat, 24 Oct 2009 15:12:15 -0000</pubDate></item><item><title>Re: How Much Data is a Human Genome? Not Much.</title><link>http://www.thinkgene.com/how-much-data-is-a-human-genome/#comment-19733941</link><description>As there are 4 bases, the base id could be represented in a 2-bit field for each base in a sequence. So, for storing, A G C T could be stored as 00 01 10 11 respectively and then retrieved and converted to human-readable A G C T.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">size_13_shoes</dc:creator><pubDate>Sat, 10 Oct 2009 07:27:50 -0000</pubDate></item><item><title>Re: How Much Data is a Human Genome? 
Not Much.</title><link>http://www.thinkgene.com/how-much-data-is-a-human-genome/#comment-19484364</link><description>I didn't know how "neandrothal" would contain too much data but it seems the time is not far when Human Genome &lt;br&gt;&lt;br&gt;&lt;a href="http://www.walktall.co.uk/footwear-c-26.html" rel="nofollow"&gt;size 13 shoes&lt;/a&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">IftikharTirmizi</dc:creator><pubDate>Thu, 08 Oct 2009 03:50:59 -0000</pubDate></item><item><title>Re: How Much Data is a Human Genome? Not Much.</title><link>http://www.thinkgene.com/how-much-data-is-a-human-genome/#comment-16814482</link><description>I believe you about the beauty of the fractals your PC creates out of a short equation. But what do you think, would it be possible to write a fractal that would replicate in 3D the face of your friend? Remember, we are speaking about the form of an organism here; a human face does not look like a cauliflower.&lt;br&gt;&lt;br&gt; Fractal pictures are generated through recursive use of an equation; it appears innumerable times in the created picture.&lt;br&gt;In addition to the simple formula, creating such a picture needs a lot of processing power to apply the formula n times. Just imagine calculating a fractal by hand.&lt;br&gt; The same is true of compression; in general, the greater the compression ratio, the more processing power is needed to create the compressed file, or to reconstruct the original one.&lt;br&gt;&lt;br&gt; Now, chromosomes are believed to contain all the hereditary information of an organism. They contain a very small quantity of information to describe organisms of enormous complexity, and consequently some kind of data compression must be at work here, at ratios like millions to one. Let's allow even fractals as a compression method for the main forms and topologies of higher mammals, unlikely as it may seem. 
Whichever way such unimaginably high compression is to be achieved, enormous processing power is necessary to 'read' the stored information.&lt;br&gt;&lt;br&gt; Can any such processing power be identified in the cell, say the fertilized egg cell, or in any cells and tissues in the later embryonic stages? &lt;br&gt;That means, if we want fantastic compression, or even fractals, as an explanation for the 'missing memory' in the cell, we are confronted with the 'missing processor' problem :). Where is it? Wouldn't it have been easier to confess 'We don't know' in the first place?&lt;br&gt;&lt;br&gt;More broadly formulated: does anyone know, even in rough outline, how the information from the chromosomes is transferred to and realized in the concrete shapes and forms of organisms and their sub-structures? An answer describing how proteins are synthesized on the basis of the genes' information would not tell me much about what I really ask here. &lt;br&gt;If the answer is no, what could this idea of fractals in genetics be but a vague hypothesis without any causal content? It is not enough to say 'Fractals do it!'. How do they do it? Or how does anything else do it? I do not think anyone could answer these questions today.&lt;br&gt;&lt;br&gt; Early in the 19th century, to explain the energy source of the sun, it was proposed that the sun is a heap of burning coal. The first idea that could come to one's mind at the time of the industrial revolution was obviously the ubiquitous coal, powering its furnaces and steam engines :). The hypothesis was taken seriously at first, only to be abandoned soon afterwards for its obvious inadequacy. Of course, nothing could have been known at the time about the nuclear processes in the sun, not even in roughest outline. 
The real explanation came more than a hundred years later.&lt;br&gt;&lt;br&gt; I have a feeling that a similarly big chunk of knowledge is missing in the biology of today for a viable explanation of many important aspects of life, including the mechanisms of the transfer of hereditary information. We have to make do with coal-heap, explain-it-away theories, instead of a sincere and brave 'We don't know.'</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">slobodan_cekic</dc:creator><pubDate>Thu, 17 Sep 2009 11:00:00 -0000</pubDate></item><item><title>Re: How Much Data is a Human Genome? Not Much.</title><link>http://www.thinkgene.com/how-much-data-is-a-human-genome/#comment-16737742</link><description>Well, it is not hard to follow what you explain. It could be summarized as data compression of the hereditary information.&lt;br&gt;(like in a compressed image: ... 255, write the white pixel 1244 times in this line and then gray 110, 86 times...)&lt;br&gt;&lt;br&gt;Data compression is the least that we can expect from the so obviously ingenious nature of living things. I certainly do not expect the structure of a skin cell to be written down as many times as there are cells, or mitochondria described x times, etc. &lt;br&gt;One time is enough. You still have to position the cells precisely along the lines of the fingerprints, and I do not even want to mention the brain. The exact position of a single hair, 2 mm right or left, might be of low priority to the organism, but the way the nerve cells are connected certainly isn't.&lt;br&gt;&lt;br&gt;Tissues and cells have relative spatial positions and shapes. With all the compression the hereditary information is presumably subjected to (however and wherever stored), the size of the 'file' must still be enormous. 
&lt;br&gt;&lt;br&gt;The human brain alone is said to have 100 billion (10^11) cells, and a multiple of that number in dendrites that realize the complex brain circuitry through synapses with other cells. Even if we take into account the certain existence of 'typical' circuits, the amount of information needed to describe the brain remains mind-boggling.&lt;br&gt;&lt;br&gt; Even on the cell level, numerous cell types have a very complex internal life, with very intricate and ingenious internal chemical regulation and metabolism. This exists in a scaled-up form on the tissue, organ and organism level, too.&lt;br&gt;&lt;br&gt; It should strain any informed credulity a bit that even the structure and functioning of the cell types in the human organism can be described with 740 MB, with the best compression methods thinkable.&lt;br&gt;&lt;br&gt; You probably mean fractals when you mention generating images with simple algorithms. Nature certainly uses fractal-like shapes (broccoli, flowers, etc.) where it suits the function (nature simply uses everything :), but try describing the wing profile of a bird, or brain circuitry for that matter, with a fractal. The exact topology and shape of the last two are crucial to the function and cannot be left to a will-o'-the-wisp fractal; that is why you cannot recognize the innumerable recursions of a simple form in the design of, say, a human skull. We would end up with anything but simple fractal formulas, trying to describe it mathematically.&lt;br&gt;&lt;br&gt; Why don't we start with something simpler, say fractalizing the shape of a ship's hull, or compressing a song recording to a fractal, before trying it on the forms of the higher organisms?&lt;br&gt;&lt;br&gt; Genetics has a problem here, and a big one, too: the location of the greatest part of the hereditary information is not known.&lt;br&gt; Mentioning fractals to explain this looks to me akin to looking for a miracle. 
What cannot be explained is mysterious; rationalism abhors the mysterious and any suggestion that things unknown may exist. Interestingly, and typically for our age of reason, you search for the solution in a field we know something about: fractals. Indeed, is there anything that we, Descartes' grandchildren, do not know?&lt;br&gt;&lt;br&gt; Wouldn't it be simpler to say 'We do not know', accepting that the answer may lie in the mysterious realm of the unknown?&lt;br&gt; Socrates would have liked that answer better, I am sure.&lt;br&gt;Denying the ignorance, one never starts searching for the answer.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">slobodan_cekic</dc:creator><pubDate>Wed, 16 Sep 2009 12:58:48 -0000</pubDate></item><item><title>Re: How Much Data is a Human Genome? Not Much.</title><link>http://www.thinkgene.com/how-much-data-is-a-human-genome/#comment-14771800</link><description>Yes, some 750 megabytes for the haploid sequence is about right. To get the diploid sequence, you only need the "diff" file of SNPs, which will amount to a couple of megabytes, so 800 megabytes for the diploid genome would sound about right. &lt;br&gt;&lt;br&gt;As for your claim of compressing this down to 10 megabytes, this is completely unrealistic. Here is a 2008 paper estimating the entropy rate of the human genome:&lt;br&gt;&lt;a href="http://www.biomedcentral.com/1471-2164/9/509" rel="nofollow"&gt;http://www.biomedcentral.com/1471-2164/9/509&lt;/a&gt;&lt;br&gt;They come up with about 1.8 bits per base pair, which would mean that even with an optimal compression algorithm, the best you can hope for is compression on the order of 10% relative to the 2-bit encoding.&lt;br&gt;&lt;br&gt;Unless, of course, your compression "algorithm" itself contains 750 megabytes of data, and will only write out the differences between your genome and some reference genome. In this case, you can hope for "compression" by 99.5%, or down to a couple of megabytes. 
But this isn't "compression", it is transfer of information from the "data file" to the "program file".&lt;br&gt;&lt;br&gt;If you think that 800 MB is "not much", well sure, you can store your genome on your ipod nano. Your body, however, stores it in each cell nucleus. This is data storage at the molecular level, far beyond the reach of our current technology. And then the information doesn't just sit there, but is being actively processed within the cell nucleus, a structure a few micrometers in size. This is beyond any realistic scope of human-made nanotechnology and will remain so for many years.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">dab</dc:creator><pubDate>Thu, 13 Aug 2009 07:24:31 -0000</pubDate></item><item><title>Re: How Much Data is a Human Genome? Not Much.</title><link>http://www.thinkgene.com/how-much-data-is-a-human-genome/#comment-10729623</link><description>Because it's patterns. It's 1 gene per cell, it's instructions that may say "keep making these until chemical_gradient_$c-32 falls below threshold x and then stylize them based on concentration of chemical_gradient_^f-03"&lt;br&gt;&lt;br&gt;Well, maybe that's a bit hard to follow so let's try this instead: how many hairs are on your arm? Well, I don't really care about the particular number but what I want to know is if you had the same number of hairs on your arm when you were a child, and I mean the three foot tall variety.&lt;br&gt;&lt;br&gt;No, no you didn't. You had many fewer BUT they were about the same distance apart. Now, I'm sure you know that your arms don't just grow at the ends- there's a lot of growth in the middle and it's more or less continuous... but how could you add new hairs evenly spaced in that? &lt;br&gt;&lt;br&gt;Well it's simple. Much like our DNA you just need two values to keep track of it (though it's not really bits, it's not THAT simple.) 
You need a protein that causes hairs to grow and you need a protein that prevents them from growing. Like a lot of things in our body the protein that prevents hair from growing just stops cells from making the protein that promotes hair formation but the promoting protein promotes the preventer and promotes itself. There's another trick though. The preventer moves around between cells much more easily than the promoter.&lt;br&gt;&lt;br&gt;No need to do mental gymnastics here, I'll just state the end result: cells in high concentration of the promoter make enough of it to overcome the effects of the preventer and low concentrations just pool up on the preventer... up to a point. If there aren't any hairs close enough to prevent another from growing they don't have enough of the preventer so the promoter takes over and gives you another hair.&lt;br&gt;&lt;br&gt;A similar set up is also used to make sure you don't grow two heads. In fact this kind of thing is used so often that we can safely say the information used to build your body is many many times smaller than the actual information it would take to record the current state of your body.&lt;br&gt;&lt;br&gt;If you're much of a programmer you know how just a few lines of code (the file might end up being a few kb if you didn't want it really small) could produce an image of many gigabytes in size, if you had some reason to let it make a large enough image.&lt;br&gt;&lt;br&gt;Don't get me wrong though. There is more to us than our DNA. &lt;br&gt;Our DNA basically lays out the boundaries of what we can possibly grow to be and the environment we grow in narrows it down until we reach that single possibility that is ultimate "you."</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Shoku</dc:creator><pubDate>Thu, 11 Jun 2009 01:25:19 -0000</pubDate></item><item><title>Re: How Much Data is a Human Genome? 
Not Much.</title><link>http://www.thinkgene.com/how-much-data-is-a-human-genome/#comment-10729085</link><description>How DARE you patent it.  You should share that with the world freely as a show of good faith.  To do otherwise is reprehensible.  Patent and copyright are the bane of our legal system at the moment.  They stifle community and promote individualistic gain at the expense of the greater good of the community.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">guitarMan666</dc:creator><pubDate>Thu, 11 Jun 2009 00:47:30 -0000</pubDate></item><item><title>Re: How Much Data is a Human Genome? Not Much.</title><link>http://www.thinkgene.com/how-much-data-is-a-human-genome/#comment-10696684</link><description>Neandrothal is right, but if you know half the code can't you get the second half because G attaches to C and A attaches to T?</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">drshows</dc:creator><pubDate>Wed, 10 Jun 2009 10:08:50 -0000</pubDate></item><item><title>Re: How Much Data is a Human Genome? Not Much.</title><link>http://www.thinkgene.com/how-much-data-is-a-human-genome/#comment-8223798</link><description>My PC only needs a tiny little fractal program to generate a fractal world of incredible complexity and beauty (e.g. the Mandelbrot set).  Two identical program runs will produce identical outputs, unless I introduce a bit of noise.  The earlier the noise, the wider the divergence. Hence identical twins have differing fingerprints.&lt;br&gt;&lt;br&gt;So I can believe the small numbers quoted.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">repworth</dc:creator><pubDate>Wed, 15 Apr 2009 07:32:02 -0000</pubDate></item><item><title>Re: How Much Data is a Human Genome? 
Not Much.</title><link>http://www.thinkgene.com/how-much-data-is-a-human-genome/#comment-2464591</link><description>I am trying to draw your attention to this: The human, just like any other organism, has its qualities determined, and their description then must reside somewhere. The amount of information needed to describe a human organism is enormous, the amount of information carried by the genes very limited in comparison.&lt;br&gt; Now, let's take a look at this possible analogy.&lt;br&gt; Imagine you are demonstrating a PC to someone who has no idea of computers whatsoever, and has never seen one. (Increasingly difficult to find, but there must still be some around :)&lt;br&gt; OK, you show him how inputs on the keyboard produce results on the screen. Knowing nothing about the PC under the desk, our computer novice has to think that the keyboard alone causes all the fascinating happenings on the screen.&lt;br&gt; Now our virtuous genetics has got hold of the keyboard, the genes; making changes there changes the organism. But how for God's sake does it follow that all the hereditary information resides there, and not on some 'HD' somewhere, away from the 'keyboard'?&lt;br&gt; I am simply pointing out that the 'keyboard' has practically no data storage capacity for the task.&lt;br&gt;&lt;br&gt;'We believe that most of our hereditary information resides in genes because it does. '&lt;br&gt;&lt;br&gt;Oh, pardon the heresy involved, but I really don't know how you know that.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Slobodan Cekic</dc:creator><pubDate>Sun, 31 Aug 2008 20:41:49 -0000</pubDate></item><item><title>Re: How Much Data is a Human Genome? 
Not Much.</title><link>http://www.thinkgene.com/how-much-data-is-a-human-genome/#comment-2464590</link><description>740MB is the size of a human haploid nucleotide base string, not the data necessary to describe a mature human.&lt;br&gt;&lt;br&gt;We believe that most of our hereditary information resides in genes because it does. However, a genome, as you say, cannot possibly fully describe a mature human. A genome is more like a brief mathematical equation used to produce beautifully complex fractal design when fed with ambient noise and interpreted as colors and coordinates on a screen.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">drewyates</dc:creator><pubDate>Sun, 31 Aug 2008 14:50:41 -0000</pubDate></item><item><title>Re: How Much Data is a Human Genome? Not Much.</title><link>http://www.thinkgene.com/how-much-data-is-a-human-genome/#comment-2464589</link><description>Now, please do take a look at your fingertips. You'll see the fine lines of your fingerprint pattern. It is unique, and can be used to identify a human; so fine and even much finer structures are defined in your organism.&lt;br&gt;Now, how high would the 3D positional information content alone be that is needed to describe a human?&lt;br&gt; You would need to position single cells, define the inner structure of particular cell types, describe the form of single nerve cells (dendrites)...etc&lt;br&gt; Now how many cells are there in the human organism?&lt;br&gt; Without any calculation, we can see that the quantity of information needed to describe a human runs to uncounted terabytes. Human chromosomes contain, as calculated here, 740 MB.&lt;br&gt; So, why for God's sake do we believe that the whole of our hereditary information resides in the genes?</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Slobodan Cekic</dc:creator><pubDate>Sun, 31 Aug 2008 14:25:15 -0000</pubDate></item><item><title>Re: How Much Data is a Human Genome? 
Not Much.</title><link>http://www.thinkgene.com/how-much-data-is-a-human-genome/#comment-2464592</link><description>For storing the methylation information you will need another bit for each base. This makes it 3 CDs :-)</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">c.lup</dc:creator><pubDate>Mon, 04 Aug 2008 12:44:23 -0000</pubDate></item><item><title>Re: How Much Data is a Human Genome? Not Much.</title><link>http://www.thinkgene.com/how-much-data-is-a-human-genome/#comment-2464587</link><description>In truth, human genomes can be more complex than even diploid (think CNV). This is especially true for cancer genomes. You may also want to capture more than one genome in the reference, e.g., you may want to include the variations in dbSNP in your reference.  To include just SNPs, you could expand your four-letter alphabet to include all the IUPAC DNA codes. Including indels would be even more complicated. Here are some &lt;a href="http://www.politigenomics.com/2008/06/when-stars-align.html" rel="nofollow"&gt;thoughts on how to represent a complex reference genome&lt;/a&gt;.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">David Dooling</dc:creator><pubDate>Tue, 01 Jul 2008 11:24:04 -0000</pubDate></item><item><title>Re: How Much Data is a Human Genome? Not Much.</title><link>http://www.thinkgene.com/how-much-data-is-a-human-genome/#comment-2464586</link><description>Thomas: I'd say that most bioinformatics scripts and programs operate on ASCII data; bit-packing the data first is rather the exception, reserved for the few hard-core tools like BLAT/BLAST, etc. Most everyday scripts still parse FASTA into strings and operate on them, I'd say.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Maximilian Haeussler</dc:creator><pubDate>Tue, 01 Jul 2008 04:57:33 -0000</pubDate></item><item><title>Re: How Much Data is a Human Genome? 
Not Much.</title><link>http://www.thinkgene.com/how-much-data-is-a-human-genome/#comment-2464581</link><description>As for the hope of crunching 770MB down to 700MB, it should be noted that programs like gzip rarely get better compression than 2 bits per nucleotide[1].&lt;br&gt;&lt;br&gt;Also, FASTA and other file formats are not primarily used for storage but for transport. Almost no bioinformatics program operates directly on ASCII data; instead, such exchange formats are transformed into some internal representation.&lt;br&gt;&lt;br&gt;For the 10MB I guess the author is thinking in terms of working on a diff with respect to some reference genome. While that is probably workable for applications on the human genome, it's not really patentable (UNIX patch and diff being older than me and there's probably even older prior art) and impracticable on a general scale. Impracticable because an index for describing any sequence in such a relative way would be far too big, i.e. it would probably require more storage than only transferring the sequences worked on directly.&lt;br&gt;&lt;br&gt;[1]</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Thomas Jahns</dc:creator><pubDate>Tue, 01 Jul 2008 02:47:33 -0000</pubDate></item><item><title>Re: How Much Data is a Human Genome? Not Much.</title><link>http://www.thinkgene.com/how-much-data-is-a-human-genome/#comment-2464585</link><description>Ed, you bring up a very good point about methylation and other proteins on the DNA. If you only care about sequence, these things don't matter, but they definitely influence which genes are active or repressed, and even how active a gene is. I suppose it depends on what you're storing the data for.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Josh Hill</dc:creator><pubDate>Fri, 27 Jun 2008 02:13:00 -0000</pubDate></item></channel></rss>