Financial Daily from THE HINDU group of publications
Monday, Sep 12, 2005

eWorld - Books
Columns - Books 2 Byte


Some biology for computer scientists

D. Murali

How should computer scientists proceed in computational biology? Simple. Learn as much real biology as possible.

GIVEN a string P called the pattern and a longer string T called the text, the exact matching problem is to find all occurrences, if any, of P in T. Thus begins Dan Gusfield's Algorithms on Strings, Trees, and Sequences, from Cambridge University Press (www.cambridge.org).
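The definition above can be tried out in a few lines of Python. This is only a minimal, brute-force sketch of the exact matching problem, not an algorithm taken from the book; the function name find_all and the sample strings are made up for illustration.

```python
def find_all(pattern: str, text: str) -> list[int]:
    """Return every 0-based position in text where pattern occurs."""
    n, m = len(pattern), len(text)
    # Assumption for this sketch: an empty pattern has no occurrences.
    if n == 0 or n > m:
        return []
    hits = []
    for start in range(m - n + 1):
        # Compare the pattern against the window of text beginning at start.
        if text[start:start + n] == pattern:
            hits.append(start)
    return hits


if __name__ == "__main__":
    # Overlapping occurrences are reported too.
    print(find_all("aba", "bbabaxababay"))
```

Every window of the text is compared with the whole pattern, so the work grows with the product of the two lengths. That is the cost the cleverer methods try to avoid.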

As you know, the pattern is what you type after pressing Ctrl+F, or what you hand to Google as a query to be answered by sifting through billions of Web pages.

If you wonder whether the existing search solutions are already good enough, the author points out that the exact matching problem is not yet effectively and universally solved, and so it needs further attention. "It will remain a problem of interest as the size of the databases grows and also because exact matching will continue to be a subtask needed for more complex searches that will be devised." Moreover, "the education one gets from studying exact matching may be crucial for solving less understood problems," says Gusfield.

For instance, it's good to learn that there is `the pre-processing approach' that an algorithm may adopt "to efficiently skip comparisons by first spending `modest' time learning about the internal structure of either the pattern P or the text T." I guess that should work for our work too! `Fundamental pre-processing' is the phrase for pre-processing that is independent of any particular matching algorithm.
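One well-known way to `learn about the internal structure' of a string is to compute, for every position, how long the substring starting there agrees with the start of the whole string. These are commonly called Z-values. The sketch below is an illustration of that idea under my own assumptions, not a reproduction of the book's presentation; the names z_values and z_search, and the separator character, are hypothetical.

```python
def z_values(s: str) -> list[int]:
    """z[i] = length of the longest substring starting at i that is also a prefix of s."""
    n = len(s)
    z = [0] * n
    if n == 0:
        return z
    z[0] = n
    left = right = 0  # s[left:right] is the rightmost stretch known to match a prefix
    for i in range(1, n):
        if i < right:
            # Reuse what was learned earlier instead of comparing again.
            z[i] = min(right - i, z[i - left])
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1
        if i + z[i] > right:
            left, right = i, i + z[i]
    return z


def z_search(pattern: str, text: str, sep: str = "\x00") -> list[int]:
    """Find all 0-based occurrences of pattern in text using Z-values."""
    if not pattern:
        return []
    # Assumption: sep is a character that appears in neither string.
    if sep in pattern or sep in text:
        raise ValueError("separator must not occur in pattern or text")
    combined = pattern + sep + text
    z = z_values(combined)
    n = len(pattern)
    return [i - n - 1 for i in range(n + 1, len(combined)) if z[i] == n]


if __name__ == "__main__":
    print(z_values("aabcaabxaaz"))
    print(z_search("aba", "bbabaxababay"))
```

The `modest' time is spent once, on the Z-values. After that, a position in the text is a match exactly when its Z-value equals the pattern length.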

Each chapter ends with exercises, such as: "Suppose one is given a DNA string of n nucleotides, but you don't know the correct `reading frame'. That is, you don't know if the correct decomposition of the string into codons begins with the first, second, or third nucleotide of the string. Each such `frameshift' potentially translates into a different amino acid string..."
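To see what the three reading frames look like, here is a small sketch that splits a DNA string into codons starting at the first, second or third nucleotide. It does not attempt the translation into amino acids, and it is not a solution to the book's exercise; the function name codons_by_frame and the sample string are invented for illustration.

```python
def codons_by_frame(dna: str) -> dict[int, list[str]]:
    """Map each frame offset (0, 1, 2) to the complete codons read from that offset."""
    frames = {}
    for offset in range(3):
        # Stop early enough that every slice is a full three-letter codon.
        frames[offset] = [dna[i:i + 3] for i in range(offset, len(dna) - 2, 3)]
    return frames


if __name__ == "__main__":
    for offset, codons in codons_by_frame("ATGGCCTAA").items():
        print(offset, codons)
```

Leftover nucleotides at the end of a frame are simply dropped in this sketch.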

Read about the `bad character rule', a useful heuristic for mismatches near the right end of P, as the author explains. The rule is `reputed to be highly effective in practice, particularly for English text'. Ah, you knew that! However, `for small alphabets' the rule is less effective, it seems, so there's another rule called `the strong good suffix rule'.
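For the curious, here is a sketch of a search that uses only a simple form of the bad character rule: compare the pattern with the text from right to left and, on a mismatch, slide the pattern so that its rightmost copy of the offending text character lines up with it. This is a simplified variant written for illustration and it leaves out the good suffix rule altogether; the function name is hypothetical.

```python
def bad_character_search(pattern: str, text: str) -> list[int]:
    """Find all 0-based occurrences of pattern in text, shifting by the bad character rule."""
    n, m = len(pattern), len(text)
    if n == 0 or n > m:
        return []
    # Rightmost position of each character in the pattern.
    rightmost = {ch: i for i, ch in enumerate(pattern)}
    hits = []
    start = 0
    while start <= m - n:
        j = n - 1
        # Compare from the right end of the pattern towards the left.
        while j >= 0 and pattern[j] == text[start + j]:
            j -= 1
        if j < 0:
            hits.append(start)
            start += 1  # conservative shift after a full match
        else:
            bad = text[start + j]
            # If the rule suggests a non-positive shift, fall back to moving one place.
            start += max(1, j - rightmost.get(bad, -1))
    return hits


if __name__ == "__main__":
    print(bad_character_search("aba", "bbabaxababay"))
```

With a large alphabet a mismatched character often does not occur in the pattern at all, so the pattern jumps ahead by several places at once. With only four letters, as in DNA, such long jumps are rarer.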

Since the focus of the book is on `computational biology', the author provides many references to that field. For instance, when discussing wild cards, he looks at their occurrence in a DNA transcription factor, which is "a protein that binds to specific locations in DNA and regulates, either enhancing or suppressing, the transcription of the DNA into RNA." Zinc Finger is a common transcription factor! "The study of transcription factors has exploded in the past decade. Many transcription factors are now known and can be separated into families characterised by specific substrings containing wild cards," writes Gusfield.
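A wild card here is a pattern position that matches any single character. The sketch below shows the idea with a brute-force search; the choice of `?' as the wild card symbol, the function name and the sample strings are assumptions made for illustration, not motifs from the book.

```python
def wildcard_find_all(pattern: str, text: str, wildcard: str = "?") -> list[int]:
    """Return 0-based positions where pattern matches text, with wildcard matching any one character."""
    n, m = len(pattern), len(text)
    if n == 0 or n > m:
        return []
    hits = []
    for start in range(m - n + 1):
        window = text[start:start + n]
        # Each pattern character must equal the text character or be the wild card.
        if all(p == wildcard or p == t for p, t in zip(pattern, window)):
            hits.append(start)
    return hits


if __name__ == "__main__":
    print(wildcard_find_all("A?C", "ATCGAGCAAC"))
```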

How should computer scientists proceed in computational biology? The author's advice is to first learn as much real biology as possible, through journals, conferences and discussions; and second, not to be limited by "the already formalised computational problems and established models". Gusfield is optimistic that computer science will make the most serious contribution to biology after the emergence of a large community of people who understand both fields. He foresees that a community of biology-educated computer scientists will make the largest impact by proceeding the same way that molecular biology proceeds - "by picking problems and research approaches that best suit available and potential techniques".

If you're ready to jump in, start with Gusfield!

Don't plan less for wireless

ARE you clueless about how to put up a wireless LAN? If yes, turn to Bruce Alexander's 802.11 Wireless Network Site Surveying and Installation, from Cisco (www.ciscopress.com). A WLAN can give you productivity gains, but you can't afford to skimp on the details, because "selecting the wrong components or the wrong architecture, or installing the WLAN using improper guidelines, can cause you to just as easily end up with a system the users hate because of arcane access and use rules. Or you will have a system that causes the network to become unbearably slow, insecure, or extremely vulnerable."

The number 802.11 refers to the IEEE Working Group responsible for WLAN standards. The 802.11 specification, which became a standard in July 1997, defines two RF (radio frequency) technologies operating in the 2.4 GHz band, viz. DSSS or direct-sequence spread spectrum and FHSS or frequency-hopping spread spectrum.

Wi-Fi stands for wireless fidelity and is pronounced "why-phy"; it is the trade name of the WFA or Wi-Fi Alliance. Wi-Fi has come to mean WLAN for many users, the book notes. The basic WLAN consists of a device that is attached to the network, an antenna, and a portable device called the client. AP is short for access point, "the device that provides access to the network by the remote or portable radio devices". This is also referred to as a `wireless gateway' in small networks.

The book discusses the site survey at length and explains how to go about the task. It advises that site information should include data on "number of floors, floor construction, ceiling height, ceiling construction, lift availability, plenum ceiling, wall construction, current percentage of stock level, temperature changes, and hazardous areas".

Take care to keep APs at least 10 feet away from any standard microwave oven, advises the author. He cautions about emergency rooms in hospitals that use sensitive equipment such as electrocardiographs and other monitoring systems. "One common problem occurs with older plotters and printers," he notes. "The RF energy can cause slight variations in the print and plotter driver mechanisms, resulting in glitches in the patient printouts."

During a site survey, you can find out whether radio signals are absorbed, reflected or refracted. For instance, "material that contains a high level of moisture - such as bulk paper and cardboard - absorbs the signal." Often, the material that hinders the signals may be out of your sight, cautions Alexander, citing as examples "steel reinforcement in the concrete walls and flooring, certain types of tinting on windows that contain metal properties, and some types of insulations used in walls."

The key features of a WLAN are software upgrade capabilities, rogue AP detection, flexibility, assisted survey and installation tools, self-healing, remote debugging and so on. As the number of APs on your system grows, manual configuration becomes a challenge, the author points out, underlining the need for self-healing systems.

Essential read before going wireless!

Tailpiece

"My guruji has given me an e-mantra!"

"How does that sound?"

"Dreem Sreem Hreem Kleem!"

Books2Byte@TheHindu.co.in


Copyright © 2005, The Hindu Business Line. Republication or redissemination of the contents of this screen are expressly prohibited without the written consent of The Hindu Business Line