Sunday, July 25, 2004
Friday, July 23, 2004
[emerging tech] "Web Engineering: The Evolution of New Technologies" & the Ultimate Killer App
- Should the IT agenda include investment in outsourcing technologies or services?
- Does the future of the business include operations in, or electronic trade with, additional countries - China, for example?
- Are the services of an outside provider being considered to help in managing proliferating applications or complex "interenterprise" business relationships?
- What role will utility computing play in the future of IT?
Monday, July 19, 2004
[emerging tech] "Managing XML Data" (Web Engineering: The Evolution of New Technologies)
A diverse set of factors has fueled the explosion of interest in XML ( http://www.w3.org/TR/REC-xml ): XML's self-describing nature makes it more amenable for use in loosely coupled data-exchange systems, and the flexible semistructured data model behind it makes it natural as a format for integrating data from various sources.
But much of its success stems from the existence of standard languages for each aspect of XML processing and the rapid emergence of tools for manipulating XML. Popular tools include parsers such as Xerces ( http://xml.apache.org/xerces-j ), query processors such as Galax ( http://db.bell-labs.com/galax ), and transformation tools such as Xalan ( http://xml.apache.org/xalan-j ). The development of this standards framework has made XML dialects powerful vehicles for standardization in communities that exchange data.
In this article, we discuss the main problems involved in managing XML data. Our objective is to clarify potential issues that must be considered when building XML-based applications, in particular the benefits of XML solutions as well as their possible pitfalls. Our intent is not to give an exhaustive review of XML data-management (XDM) literature or XML standards, nor a detailed study of commercial products. Instead, we aim to provide an overview of a representative subset to illustrate how some XDM problems are addressed.
Because data typically is stored in non-XML database systems, applications must publish data in XML for exchange purposes. When a target application receives XML data, it can remap and store it in internal data structures or a target database system. Applications can also access an XML document either through APIs such as the Document Object Model (DOM; http://www.w3.org/DOM ) or query languages. The applications can directly access the document in native format or, with conversion, from a network stream or non-XML database format.
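As a rough illustration of the two access styles mentioned above, the sketch below reads the same document once through DOM calls and once through a path expression. It uses Python's standard library (`xml.dom.minidom` for the DOM interface, `xml.etree.ElementTree` for a limited XPath-style query). The sample document and the function names are assumptions made for this example and do not come from the article.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
ORDERS_XML = """<orders>
  <order id="1"><customer>Ada</customer><total>30.50</total></order>
  <order id="2"><customer>Grace</customer><total>12.00</total></order>
</orders>"""


def customers_via_dom(xml_text):
    """Navigate the parsed tree with DOM calls and collect customer names."""
    doc = xml.dom.minidom.parseString(xml_text)
    names = []
    for node in doc.getElementsByTagName("customer"):
        # Each <customer> element holds a single text node child.
        names.append(node.firstChild.data)
    return names


def total_via_path(xml_text, order_id):
    """Select one order's total with an XPath-style path expression."""
    root = ET.fromstring(xml_text)
    elem = root.find(f"./order[@id='{order_id}']/total")
    return None if elem is None else float(elem.text)
```

The DOM version spells out the navigation step by step, while the path version states what to select and leaves the traversal to the library. That is roughly the trade-off between API access and query-language access.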
In contrast with relational database management systems (RDBMSs), which had a clear initial motivation in supporting online transaction processing (OLTP) scenarios, XML applications have widely varying requirements. Applications must deal with several different kinds of queries (structured and keyword-based) in different scenarios (with or without transaction support, over stored or streaming data), as well as data with varying characteristics (ordered and unordered, with or without a schema).
Commercial database vendors have also shown significant interest in XDM---support for XML data is present in most RDBMSs. Examples include IBM's DB2 XML Extender ( http://www4.ibm.com/software/data/db2/extenders/xmlext.html ), Microsoft's support for XML ( http://msdn.microsoft.com/sqlxml/ ), and Oracle's XML DB ( http://otn.oracle.com/tech/xml/xmldb/ ).
In XML, common querying tasks include filtering and selecting values, merging and integrating values from multiple documents, and transforming XML documents. While XML has enabled the creation of standard data formats within industries and communities, adoption of these standards has led to an enormous and immediate problem of exporting data available in legacy formats to meet newly created standard schemata. Several publishing languages have been proposed to specify XML views over the legacy data---that is, how to map legacy data (such as tables) into a predefined XML format.
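To make the publishing problem concrete, here is a minimal sketch of mapping rows from a relational table into a predefined XML shape. It is hand-written mapping code, not one of the publishing languages the article refers to. The `customer` table, its columns, and the `<customers>` target format are all hypothetical, chosen only for this example.

```python
import sqlite3
import xml.etree.ElementTree as ET


def demo_connection():
    """Build an in-memory database with a hypothetical `customer` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
    )
    conn.executemany(
        "INSERT INTO customer VALUES (?, ?, ?)",
        [(1, "Ada", "UK"), (2, "Grace", "US")],
    )
    return conn


def publish_customers(conn):
    """Map each table row to a <customer> element under a <customers> root."""
    root = ET.Element("customers")
    rows = conn.execute("SELECT id, name, country FROM customer ORDER BY id")
    for cust_id, name, country in rows:
        # The key column becomes an attribute; other columns become child elements.
        cust = ET.SubElement(root, "customer", id=str(cust_id))
        ET.SubElement(cust, "name").text = name
        ET.SubElement(cust, "country").text = country
    return ET.tostring(root, encoding="unicode")
```

Calling `publish_customers(demo_connection())` returns the XML view as a string. A declarative publishing language would express the same row-to-element mapping as a view definition instead of procedural code.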
In this section, we discuss limitations of existing solutions as well as some open problems. Our discussion is biased toward problems we have encountered in trying to create effective and scalable XDM solutions; it is by no means exhaustive.
Parsing and validating a document against an XML Schema or DTD are CPU-intensive tasks that can be a major bottleneck in XML management. A recent study of XML parsing and validation performance indicates that acceptable response times and transaction rates over XML data cannot be achieved without significant improvements in XML parsing technology. It suggests enhancements such as using parallel processing techniques and preparsed binary XML formats as well as better support for incremental parsing and validation.
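Incremental parsing can be sketched with the event-driven `iterparse` function in Python's standard library, which hands elements to the application as they are completed instead of building the whole tree first. The `<order>`/`<total>` document shape and the function name are assumptions for this example. The sketch covers parsing only, since the standard library parser does not validate against a DTD or XML Schema.

```python
import io
import xml.etree.ElementTree as ET


def sum_totals(stream):
    """Add up <total> values while parsing the stream incrementally."""
    total = 0.0
    # "end" events fire once an element and all of its children are parsed.
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "order":
            total += float(elem.findtext("total"))
            # Drop the order's children and text so memory use stays small.
            elem.clear()
    return total


# Hypothetical input, supplied as a byte stream such as a file or socket would give.
SAMPLE = io.BytesIO(
    b"<orders>"
    b"<order><total>30.5</total></order>"
    b"<order><total>12</total></order>"
    b"</orders>"
)
```

Because each order is processed and cleared as soon as its end tag is seen, the same loop works on documents much larger than available memory, at the cost of giving up random access to the tree.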
By using XML-specific compression techniques, tools such as XMill compare favorably against several generic compressors. Compression techniques have also been proposed that support direct querying over the compressed data, which, besides saving space, also improves query processing times.
The ability to support updates is becoming increasingly important as XML evolves into a universal data representation format. Although proposals for defining and implementing updates have emerged, a standard has yet to be defined for an update language.
Three figures & sample code; 23 references.
To request a copy of this article click on: http://tinyurl.com/6kcqw .
[humor] The Mind of an American Programmer (courtesy of Sun Microsystems)
Saturday, July 17, 2004
[news] "Urling" Instead of "Blogging"
Friday, July 09, 2004
[news] A Special Report on Business Intelligence
- Gartner expects the market to accelerate in 2004.
- The ETL (extraction, transformation, and load) market will flatten (finally).
- CPM is hot. "Hyperion, Cognos, and SAS appeared to be the best positioned non-ERP vendors to capitalize on the CPM market opportunity." However, "(they) believe that SAP is the best-positioned large enterprise software vendor to execute in both the BI and CPM market ..."
- Finally, the Gartner BI conference itself was hot, with 973 attendees, an increase in attendance of 70% over last year.