Financial Daily from THE HINDU group of publications, Monday, Jul 04, 2005
eWorld
Columns - Books 2 Byte
Stored data is doubling every nine months
D. Murali
KNOWLEDGE discovery and data mining, or KDD, has created a sharp discontinuity in economic affairs and opened up opportunities for business, writes Ramasamy Uthurusamy in his foreword to Data Mining: Next Generation Challenges and Future Directions, from The MIT Press (http://mitpress.mit.edu). That stored data is doubling every nine months, twice as fast as Moore's Law, is an alarming fact you'd gather; the growth comes from innovative technologies even as cost is rapidly declining. The `glut of data' has been fuelling the demand for KDD tools. And to help the miners, the editors of the book, Hillol Kargupta, Anupam Joshi, Krishnamoorthy Sivakumar, and Yelena Yesha, have put together erudite pieces on a variety of topics in this area. They note in their preface that over the last decade, data mining technology has impacted many domains, "from credit-card fraud detection and customer relations management to drought modelling and intrusion detection".
The book has four sections, viz. pervasive, distributed, and stream data mining; counterterrorism, privacy, and data mining; scientific data mining; and web, semantics, and data mining. Chapter 1 is titled `Existential pleasures of distributed data mining'; it discusses DDM algorithms, and you'd learn that the PADMA system is a `distributed clustering-based system for document analysis from homogeneous data sites'. Chapter 2 talks about the `intelligence' work for the US Government, and the Himalaya Data Mining Project at Cornell University. Next is a chapter on "the problem of combining multiple partitionings of a set of objects without accessing the original features or internals of individual clustering algorithms". Encounter the knowledge grid in chapter 4, and `photonic data services, integrating data, network and path services' in chapter 5, where you'd read about how "a bandwidth-demanding application can request an optical connection between the data sources and the data sinks for a specific application".
Recent emerging applications, such as network traffic analysis, web click stream mining, power consumption measurement, sensor network data analysis, and dynamic tracing of stock fluctuation, call for the study of a new kind of data called `stream data', informs chapter 6 on `multiple time granularities'. Read about `epsilon approximation' or EA in the next chapter; "the EA algorithm repeatedly and deterministically halves the data to obtain the final sample". Chapter 8 looks at ways to find `semantic structures in videos' and suggests `a novel unsupervised approach'.
Part two begins with Bhavani Thuraisingham's piece on `data mining for counterterrorism', which emphasises the need for multilingual data mining. This is followed by a chapter on `biosurveillance and outbreak detection', with a sobering thought: "The state of the art for detecting abnormalities in surveillance data is primitive. Analyses are typically manual." Meet MINDS or Minnesota Intrusion Detection System in Chapter 11 where you can hear the SNORT!
I've mined through less than half the book, and that too just on the surface. But, if you're mindful about security, this is a collection not to be left unturned.
Proven solutions to recurring problems
EACH pattern is a three-part rule, which expresses a relation between a certain context, a certain system of forces which occurs repeatedly in that context, and a certain software configuration which allows these forces to resolve themselves. Citing this definition of pattern by Jim Coplien, the preface to Remoting Patterns, from Wiley Dreamtech (www.wileydreamtech.com), elaborates on what a pattern is. The authors Markus Völter, Michael Kircher and Uwe Zdun explain that patterns are never new ideas; they are proven solutions to recurring problems. "So known uses for a pattern must always exist. A good rule of thumb is that something that does not have at least three known uses is not a pattern." Thus, rather than inventing patterns from scratch, one discovers them in, and then extracts them from, real-life systems. "To find patterns in software systems, the pattern author has to abstract the problem/solution pair from the concrete instances found in the systems at hand. Abstracting the pattern while preserving comprehensibility and practicality is the major challenge of pattern writing."
Do you know that the notion of patterns in software builds on a 1977 book titled A Pattern Language - Towns, Buildings, Construction that Christopher Alexander wrote? There he described "patterns that guide the creation of space for people to live." The book on hand discusses both architectural and design patterns that can serve as `foundations of enterprise, Internet and real-time distributed object middleware'. The chapter on `pattern language' introduces one to a `broker', which hides and mediates all communication between the objects or components of a system. "A `broker' consists of a client-side `requestor' to construct and forward invocations, as well as a server-side `invoker' that is responsible for invoking the operations of the target remote object."
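For readers who think better in code, here is a minimal Python sketch of the requestor and invoker roles quoted above. It is an illustration under stated assumptions, not the book's implementation: the class names, the JSON byte encoding and the in-process "transport" are all hypothetical stand-ins for a real network path.

```python
import json


def to_bytes(message):
    """Encode a native dict as bytes (JSON is an assumption for illustration)."""
    return json.dumps(message).encode("utf-8")


def from_bytes(data):
    """Decode bytes back into a native dict."""
    return json.loads(data.decode("utf-8"))


class Invoker:
    """Server side: finds the target remote object and invokes the operation."""

    def __init__(self):
        self._objects = {}

    def register(self, object_id, obj):
        self._objects[object_id] = obj

    def handle(self, request_bytes):
        request = from_bytes(request_bytes)
        target = self._objects[request["object_id"]]
        operation = getattr(target, request["operation"])
        return to_bytes({"result": operation(*request["args"])})


class Requestor:
    """Client side: constructs an invocation and forwards it."""

    def __init__(self, transport):
        # transport is any callable taking request bytes and returning reply bytes
        self._transport = transport

    def invoke(self, object_id, operation, *args):
        request = to_bytes(
            {"object_id": object_id, "operation": operation, "args": list(args)}
        )
        return from_bytes(self._transport(request))["result"]


class Calculator:
    """Hypothetical remote object used only for this example."""

    def add(self, a, b):
        return a + b


invoker = Invoker()
invoker.register("calc", Calculator())

# The invoker's handle method stands in for the wire between client and server.
requestor = Requestor(invoker.handle)
total = requestor.invoke("calc", "add", 2, 3)
```

The client never touches Calculator directly; it only names an object and an operation, which is the mediation the broker is said to provide.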
Before you invoke your favourite deity to comprehend all this high-funda stuff, let me add that "a `marshaller' on each side of the communication path handles the transformation of requests and replies from programming-language native data types into a byte array that can be sent over the transmission medium." In a chapter on `advanced lifecycle management' there is `the last hope': `Sponsors' are the last hope for a remote object if its lease expires, states the book. "The lease manager contacts the sponsor and asks it whether it should renew the lease for the instance that is ready to die... The respective instance would have to die immediately if the sponsor returns TimeSpan.Zero."
Art is the imposing of a pattern on experience, and our aesthetic enjoyment is recognition of the pattern, said Alfred North Whitehead, who wrote the three-volume work Principia Mathematica with his former student, Bertrand Russell, to define the logical foundation of science and mathematics, as I glean from www.answers.com. Software patterns too may qualify as works of art when we are able to enjoy aesthetically what are otherwise only endless lines of code. A useful read for system designers.
Tailpiece
"Doctor, I'm often confused between the TV remote and the mobile phone."
"Happens. I too get mixed up, especially with my post mortem and surgery candidates."
Copyright © 2005, The
Hindu Business Line. Republication or redissemination of the contents of
this screen are expressly prohibited without the written consent of
The Hindu Business Line