XML Parser and Converter

Project-XML

Introduction:

Our project is a parser and converter, not a compiler.

XML (Extensible Markup Language) is commonly used in dynamic web pages to send data from one active page to another.

A markup language is a set of annotations to text that describe how it is to be structured, laid out, or formatted. Markup languages might be manuscript form (often marks among or alongside text describing required formatting or binding), or they might be markup codes used in computer typesetting and word-processing systems. The former are also commonly used to describe the required layout of papers, articles, standards, or books. The latter tend more to be used to instantiate a particular document and nowadays are not generally used directly by authors.

In this project we convert XML code into a database table after checking almost all of the basic XML rules.

Project stages with test cases:

The main stages of our project are:

  1. Taking Input file
  2. Lexical analysis and Parser
  3. Separating tag field and info field
  4. Connecting database
  5. Inserting in the database

1. Taking Input file:

In this phase we take a file from any location on the computer and check that its extension is either .xml or .XML; otherwise an error is shown.

The following shows the selection of a file with a .bmp extension.

[Screenshot: XML Parser and Converter, Error Display]
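The extension check described above can be sketched in a few lines of Python. This is only an illustration, not the project's own code: the names `is_xml_file` and `check_input` are hypothetical, and the check looks at the file name only, not at the file contents.

```python
from pathlib import Path

# The two spellings the project accepts, as stated in the text above.
ALLOWED_SUFFIXES = {".xml", ".XML"}


def is_xml_file(path: str) -> bool:
    """Return True when the file name ends in .xml or .XML."""
    return Path(path).suffix in ALLOWED_SUFFIXES


def check_input(path: str) -> str:
    """Hypothetical helper: return the path, or raise for any other extension."""
    if not is_xml_file(path):
        raise ValueError(f"{path}: expected a .xml or .XML file")
    return path
```

With this sketch, selecting a .bmp file raises a `ValueError`, which the user interface could turn into the error display.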

2. Lexical Analysis and Parser:

In our project <, > and </ are the only delimiters. We first split the file on “>” into an array of intermediate strings. Next we use “</” to separate the tag fields from the information fields. Here we also check the tag rules, the number of brackets, and whether every opening tag has a matching closing tag. At the end of this process we are left with stacks containing the tag fields and the information fields.

In this phase we also keep a flag that counts the order in which ‘<’ and ‘>’ follow each other, which lets us distinguish the database name from the table name.

Taking an XML file as input:

[Screenshot: XML Parser]
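The splitting step of this phase can be sketched in Python as follows. It is a minimal illustration under stated assumptions, not the project's actual code: the function name `split_fields` and the `SAMPLE` string are hypothetical, and attributes, comments and mixed content are ignored.

```python
SAMPLE = """<nikhil>
<note>
<too>Tove</too>
<fromm>Jani</fromm>
<heading>Reminder</heading>
<body>Forget</body>
</note>
</nikhil>"""


def split_fields(xml_text):
    """Split on '>' first, then use '</' to separate tags from information."""
    tags, infos = [], []
    for chunk in xml_text.split(">"):
        chunk = chunk.strip()
        if not chunk:
            continue
        if chunk.startswith("</"):
            # A closing tag with no text in front of it, e.g. "</note".
            continue
        if chunk.startswith("<"):
            # An opening tag, e.g. "<too": keep the name as a tag field.
            tags.append(chunk[1:])
        elif "</" in chunk:
            # Text followed by a closing tag, e.g. "Tove</too".
            info, _closing = chunk.split("</", 1)
            infos.append(info.strip())
    return tags, infos


tags, infos = split_fields(SAMPLE)
```

For the sample, `tags` begins with the two container names (nikhil, note) followed by the four field names, and `infos` holds the four values in the same order as the fields.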

3. Error Detection:

Our project detects the following kinds of errors:

1. The number of opening and closing brackets must be equal.

[Screenshot: XML error]

An error is reported if there are any extra brackets, i.e. the number of opening brackets is not equal to the number of closing brackets.

[Screenshot: XML error types]

2. A tag is missing.

[Screenshot: XML error]

3. There is an error in a tag.

[Screenshot: XML missing tags]
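The three checks listed above can be sketched with a bracket count and a stack of open tags. The following Python is an illustration only: the function name `find_errors`, the error messages, and the tag-name rule in `NAME_RE` (a simplification of the real XML naming rules) are all assumptions, since the text does not spell out the exact rules the project applies.

```python
import re
from typing import List

# Assumed, simplified tag-name rule: a letter or underscore first,
# then letters, digits, underscores, dots or hyphens.
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")


def find_errors(xml_text: str) -> List[str]:
    errors = []
    # Check 1: the numbers of '<' and '>' must be equal.
    if xml_text.count("<") != xml_text.count(">"):
        errors.append("bracket count mismatch")
    open_tags = []  # stack of tags that are still waiting for a closing tag
    for chunk in xml_text.split(">"):
        if "<" not in chunk:
            continue
        name = chunk.split("<", 1)[1].strip()
        if name.startswith("/"):
            closing = name[1:].strip()
            if closing in open_tags:
                # Check 2: every tag opened after `closing` was never closed.
                while open_tags[-1] != closing:
                    errors.append(f"missing closing tag for <{open_tags.pop()}>")
                open_tags.pop()
            else:
                errors.append(f"closing tag </{closing}> has no opening tag")
        else:
            # Check 3: the tag name itself must be well formed.
            if not NAME_RE.fullmatch(name):
                errors.append(f"invalid tag name <{name}>")
            open_tags.append(name)
    for name in reversed(open_tags):
        errors.append(f"missing closing tag for <{name}>")
    return errors
```

An empty list means none of the three checks failed. A line such as `<loc>shipra<loc>` is reported as two unclosed `<loc>` tags, because the second `<loc>` is read as another opening tag.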

 

4. Generating Database:

We used XAMPP to connect to the SQL database. In this phase we create a database, create a table dynamically according to the number of attributes or parameters in a field of the XML file, and insert the data fields. The following shows the database created for the given XML code.

<nikhil>
<note>
<too>Tove</too>
<fromm>Jani</fromm>
<heading>Reminder</heading>
<body>Forget</body>
</note>
</nikhil> 

In this case nikhil is the database name and note is the table name.

[Screenshot: XML with XAMPP]
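The dynamic table creation and insertion can be sketched as follows. Note the substitution: the project connects to an SQL database through XAMPP, whereas this sketch uses Python's built-in `sqlite3` module with an in-memory database so that it runs on its own. The function name `store_record` and the choice of TEXT for every column are assumptions.

```python
import sqlite3


def store_record(conn, table, fields):
    """Create `table` with one TEXT column per field, then insert one row."""
    # Tag names become SQL identifiers, so reject anything unusual.
    for name in [table, *fields]:
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    column_defs = ", ".join(f'"{name}" TEXT' for name in fields)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({column_defs})')
    column_list = ", ".join(f'"{name}"' for name in fields)
    placeholders = ", ".join("?" for _ in fields)
    # Values are passed as parameters rather than pasted into the SQL text.
    conn.execute(
        f'INSERT INTO "{table}" ({column_list}) VALUES ({placeholders})',
        list(fields.values()),
    )
    conn.commit()


# Stand-in for the "nikhil" database; SQLite keeps one database per connection.
conn = sqlite3.connect(":memory:")
store_record(
    conn,
    "note",
    {"too": "Tove", "fromm": "Jani", "heading": "Reminder", "body": "Forget"},
)
```

The number of columns follows the number of fields found in the XML, which is the dynamic part of the process. With a server-based database the outer tag would instead be used in a `CREATE DATABASE` statement before the table is created.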

5. Optional Output:

If desired, all the field names can be viewed separately in the form of a report.

[Screenshot: XML parser application]

 

6. Work Done:

Our aim was to convert any given XML file containing data into a database. The process is completely dynamic, because the XML code can contain various database names and tables, and each table may in turn contain a varying number of fields.

So far we handle a single XML file, which can contain various entities that go into different tables in the same database.

Example:

<nikhil>
<note>
<too>Tove</too>
<fromm>Jani</fromm>
<heading>Reminder</heading>
<body>Forget</body>
</note>
<House>
<doorno>123</doorno>
<loc>shipra</loc>
</House>
</nikhil>
 

For this code we create the following:

Database name: nikhil

Table name: note

Another table name: House
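The mapping from one XML file to a database with several tables can be illustrated as below. This sketch uses Python's standard `xml.etree.ElementTree` parser in place of the project's own hand-written parser, purely to show the resulting structure. The function name `tables_from_xml` is hypothetical, and the sketch assumes exactly three levels (database, table, field), with each table name appearing only once.

```python
import xml.etree.ElementTree as ET

EXAMPLE = """<nikhil>
<note>
<too>Tove</too>
<fromm>Jani</fromm>
<heading>Reminder</heading>
<body>Forget</body>
</note>
<House>
<doorno>123</doorno>
<loc>shipra</loc>
</House>
</nikhil>"""


def tables_from_xml(xml_text):
    """Return (database name, {table name: {field name: value}})."""
    root = ET.fromstring(xml_text)  # outermost tag = database name
    tables = {}
    for table in root:  # second level = table names
        # Third level = field names with their text as values.
        tables[table.tag] = {
            field.tag: (field.text or "").strip() for field in table
        }
    return root.tag, tables


database, tables = tables_from_xml(EXAMPLE)
```

Here `database` is the outermost tag name and `tables` has one entry per second-level tag, so the note and House tables can each be created with their own number of columns.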

Drawbacks:

Since it is a dynamic process, it is very difficult to remove redundancy.

Future Work:

  1. This project can be extended to other markup languages.
  2. We can remove redundancy by string matching the table names inside the database against the tag names.
    Example: if a table named student already exists in the database and a tag named student1 is encountered*, then instead of creating a new table we can update the existing student table.
  3. On any page the XML code sits in the background, so we could search the page for its XML tags and then start our process.

* We assume that the fields of student and student1 are the same.
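One possible form of the string matching proposed in item 2 is sketched below. The rule used here (drop trailing digits from the tag name, then compare case-insensitively) is an assumption chosen to fit the student/student1 example, and the function name `match_existing_table` is hypothetical. Other matching rules would work equally well.

```python
import re


def match_existing_table(tag, existing_tables):
    """Return the existing table that `tag` should reuse, or None."""
    # Assumed rule: "student1" and "student" refer to the same table.
    base = re.sub(r"\d+$", "", tag).lower()
    for table in existing_tables:
        if table.lower() == base:
            return table
    return None


existing = ["note", "student"]
target = match_existing_table("student1", existing)
```

When a match is found, the new record would be inserted into the returned table instead of creating a new one. When the result is `None`, a new table is created as before.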