Andrey Vashchenko, Android Developer

Parse XML to RoomDB objects

I ran into several problems while working on one project.
The server is a 1C system, and the exchange interface is SOAP.
The first issue is that a SOAP message is XML with an extra layer on top: an envelope and namespace prefixes. That layer had to be stripped in order to work with plain XML.
<!-- before cleaning -->

<soap:Envelope xmlns:soap="">
        <m:Response xmlns:m="">
            <m:return xmlns:xs="">
                <m:name>Object name</m:name>
                <m:error>Number of errors</m:error>
                <m:objects_qty>Number of objects in the response</m:objects_qty>
                <m:information version="3" is_deleted="false" hide="false">
                       <!-- next object properties -->
                <!-- next objects -->

<!-- after cleaning -->

	<name>Object name</name>
	<error>Number of errors</error>
	<objects_qty>Number of objects in the response</objects_qty>
	<information version="3" is_deleted="false" hide="false">
		<!-- next object properties -->
	<!-- next objects -->
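One way to do this kind of cleaning is sketched below. It is only an illustration, not the code from the project: the class name `SoapCleaner` and the assumption that the payload always sits inside a `return` element are mine, and the regular expressions only handle element prefixes and `xmlns` declarations (prefixed attributes are left untouched).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: strips the SOAP wrapper and returns the plain XML payload.
final class SoapCleaner {
    // Namespace prefix right after "<" or "</", e.g. "m:" in "<m:name>".
    private static final Pattern PREFIX = Pattern.compile("<(/?)[A-Za-z_][\\w.-]*:");
    // Namespace declarations such as xmlns:m="..." or xmlns="...".
    private static final Pattern XMLNS = Pattern.compile("\\s+xmlns(:[\\w.-]+)?=\"[^\"]*\"");
    // Everything between <return ...> and the last </return>, once prefixes are gone.
    private static final Pattern PAYLOAD =
            Pattern.compile("<return[^>]*>(.*)</return>", Pattern.DOTALL);

    static String clean(String soap) {
        String noPrefixes = PREFIX.matcher(soap).replaceAll("<$1");
        String noNamespaces = XMLNS.matcher(noPrefixes).replaceAll("");
        Matcher payload = PAYLOAD.matcher(noNamespaces);
        // Assumption: if there is no <return> element, keep the whole cleaned document.
        return payload.find() ? payload.group(1).trim() : noNamespaces.trim();
    }
}
```

A regex pass like this avoids building a document tree just to throw the wrapper away, which matters when the payload is large. It does rely on the response being well-formed and on element text being properly escaped.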
The second issue is heavy objects. Besides ordinary data (strings, numbers, dates, boolean values, etc.) the responses contained large images in base64 format. Parsing is already a demanding task for a mobile device, and more data means more memory consumption and longer processing time. When a large object arrived, the app crashed with an OutOfMemoryError, and even when it did not crash, the operation took a long time.
The third problem is convenient use of models. The data had to be available offline. We used an SQLite database for storage, with the Room library as a wrapper. Room works with the database through model classes. Since the XML parser works with model classes too, I decided not to keep two models per object and merged them into one.
Problem solving
To work with XML we used the SimpleXML library. It offers a DOM parser (manual) and a Pull parser (automated).

The advantage of the Pull parser is that it takes the XML and a class describing the model as input and returns a complete object. Its disadvantage is that while it looks for a required field in the XML, it iterates over fields it has already processed. This significantly affects performance.

The DOM parser's advantage covers the Pull parser's disadvantage: the developer controls the traversal of the XML, so the parser does not iterate over already processed fields, but you have to assign the fields manually. This parser works several times faster.

The algorithms of both parsers are presented below. Don't take them as a tutorial; the main goal is to show the principle. Each task requires its own code.
Pull-parser work algorithm
DOM-parser work algorithm
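The difference between the two approaches can also be sketched in code. The example below is a toy built on the JDK's DOM API and reflection, not SimpleXML's actual implementation; the `Info` model, the `ParserSketch` class, and the `visits` counter are hypothetical and exist only to make the extra iterations visible.

```java
import java.io.ByteArrayInputStream;
import java.lang.reflect.Field;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical model with three fields taken from the sample response.
class Info {
    String name;
    String error;
    String objects_qty;
}

final class ParserSketch {
    static int visits; // how many child nodes have been inspected

    static Element root(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
    }

    // Automated: the model class drives the work. For every field the children
    // are scanned from the start, so already processed nodes are visited again.
    static <T> T parseAutomated(Element root, Class<T> type) {
        try {
            T target = type.getDeclaredConstructor().newInstance();
            NodeList children = root.getChildNodes();
            for (Field field : type.getDeclaredFields()) {
                if (field.isSynthetic()) continue; // skip compiler-generated fields
                for (int i = 0; i < children.getLength(); i++) {
                    visits++;
                    Node node = children.item(i);
                    if (node.getNodeName().equals(field.getName())) {
                        field.setAccessible(true);
                        field.set(target, node.getTextContent());
                        break;
                    }
                }
            }
            return target;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot populate " + type.getName(), e);
        }
    }

    // Manual: the XML drives the work. One pass over the children,
    // with every field assigned by hand.
    static Info parseManual(Element root) {
        Info info = new Info();
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            visits++;
            Node node = children.item(i);
            switch (node.getNodeName()) {
                case "name": info.name = node.getTextContent(); break;
                case "error": info.error = node.getTextContent(); break;
                case "objects_qty": info.objects_qty = node.getTextContent(); break;
                default: break; // ignore unknown elements
            }
        }
        return info;
    }
}
```

For an element with three children and no whitespace between them, the automated variant inspects six nodes and the manual one inspects three. The gap widens as the number of fields grows, at the price of a hand-written `switch` per model.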
As the algorithms show, the Pull parser has an extra nested loop, which slows parsing down. But the Pull parser is quicker to set up, as I said earlier, so use it when the objects are small.

Based on the information above, I decided to use both parsers. I parsed light objects automatically and heavy ones manually. This reduced the overall parsing time.

I measured the time between the start of parsing and receiving the complete objects, which gave the parsing time in milliseconds for both parsers. The results are shown in the screenshots.
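A measurement like this can be taken with a small wrapper around the parsing call. The sketch below is one possible way to do it; `ParseTimer` and `Timed` are hypothetical names, and the parsing step is passed in as a plain `Supplier`.

```java
import java.util.function.Supplier;

// Hypothetical helper that times a single parsing call.
final class ParseTimer {
    /** Holds the parsed value together with the elapsed time in milliseconds. */
    static final class Timed<T> {
        final T value;
        final long millis;

        Timed(T value, long millis) {
            this.value = value;
            this.millis = millis;
        }
    }

    static <T> Timed<T> measure(Supplier<T> parse) {
        long start = System.nanoTime();              // monotonic, unaffected by clock changes
        T value = parse.get();                       // run the parser
        long elapsedNanos = System.nanoTime() - start;
        return new Timed<>(value, elapsedNanos / 1_000_000);
    }
}
```

Running the same input through both parsers with `ParseTimer.measure(...)` gives two comparable numbers. A single run is noisy, so averaging several runs gives a fairer picture.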
The main app screen with parser type choosing
Data processing time by Pull parser
Data processing time by DOM parser
To halve the number of models, I merged the XML and Room models. The screenshots below show the separate models and the merged one.
The advantage of the merged model is not only the smaller number of classes. The main advantage is that you don't need to convert an XML model to a Room model to save the object into the database: whatever the parser returns can be inserted into the database directly. And if you need to produce XML from an object, that is easy with the merged model too.
Room model of User object
XML model for User object
Merged model of User object
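As a rough sketch, a merged model could look like the class below. The field names, the table name, and the element names are assumptions made for illustration; only the annotations themselves come from Room (`androidx.room`) and SimpleXML (`org.simpleframework.xml`).

```java
import androidx.room.ColumnInfo;
import androidx.room.Entity;
import androidx.room.PrimaryKey;
import org.simpleframework.xml.Attribute;
import org.simpleframework.xml.Element;
import org.simpleframework.xml.Root;

// Hypothetical merged model: one class carries both Room and SimpleXML annotations.
@Entity(tableName = "users")          // Room: table definition
@Root(name = "user", strict = false)  // SimpleXML: root element, unknown tags are ignored
public class User {

    @PrimaryKey
    @ColumnInfo(name = "id")
    @Attribute(name = "id")
    public long id;

    @ColumnInfo(name = "name")
    @Element(name = "name", required = false)
    public String name;

    @ColumnInfo(name = "avatar")
    @Element(name = "avatar", required = false)
    public String avatarBase64;       // large base64 image kept as text

    public User() { }                 // no-arg constructor usable by both libraries
}
```

With a class like this, an object returned by SimpleXML's `Persister.read(User.class, xml)` could be handed straight to a Room DAO method annotated with `@Insert`, with no intermediate conversion step.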
Summing Up
As a result I gained experience with the SimpleXML and Room libraries, tried two kinds of XML parsing, and learned their pros and cons. The merged model idea worked well. So on similar projects I'll spend less time on the parser.
Thanks for reading!