Polstergeist Help

 

Submitting a Request

On the Browser Application, press the Services Tab and select the services from which you wish to extract data. Then press the Start Button to create a new request. That's probably all you need to know, as the browser will walk you through the remaining steps. If you get stuck or want more information, you can always return to this page by pressing the Help Tab.

If any of your services require input parameters (e.g. DepartureAirportCode), a data entry form will pop up. After entering the input values, press the Continue button on the form.

You will now find yourself on the Requests Tab. If you wish, you can change the automatically generated name of the request to something more meaningful (e.g. My Flight to Phoenix). Then press the Submit Button.

The Polstergeist Extractor window will open, allowing you to watch as data is extracted from each web site in your request. On occasion a web site might not respond as expected (e.g. an airline which does not fly to a particular destination). In this case the Extractor will automatically time out after detecting 60 seconds of inactivity and record a message of "Data Not Available" for the service in question. You may also press the Cancel Remaining Items button at any time to cancel any extraction jobs which appear to have stalled. The Extractor will assign a message of "Canceled by User" to any service item that was still being processed when you pressed the button.

Once the extraction process has completed, the Extractor window will close and a message box will appear stating that the report has been completed. Select OK to view the report.

 

Troubleshooting Tips

If the Polstergeist application is not working, it could be for one of the following reasons.

 

Securing Your Data

All Polstergeist data files (ApplicationSettings, Requests, Reports) are stored in the Polstergeist Data directory, located under the directory where the Polstergeist application is installed (normally this will be "C:\Program Files\Polstergeist\Data\"). You can easily protect this data by using the security features built into the Windows operating system. For example, on a Windows XP machine you could do the following.

This will prevent anyone other than the signed-in user from accessing the contents of this directory. For other security options please visit the Microsoft Windows security page.

 

Adding Services to Catalogs

We have received feedback from several users wanting to create their own catalogs, but for the time being we will be mapping all services ourselves. If there are any web sites you would like us to include in our service catalogs, let us know by sending an email to support@polstergeist.com and we will do our best to accommodate your needs. Please supply a link to the target web page along with information detailing the data you would like us to extract.

 

Programming Interface

We will walk through an example of creating and submitting a request from an external program to get stock option quotes. First we will create the request manually using the Polstergeist Browser.

Select the options service from the Services Tab on the browser and follow the normal steps to submit the request.

Service Selection 

 

An XML request file will be created and placed in the "C:\Program Files\Polstergeist\Data\Requests\" directory. Use this file as a template when generating a request from within your program.

Request Template1 

You can exclude the Request Catalog and Request Name attributes as these are not required by the extraction engine. Your request would then look as follows.

Request Template2 

Your program might generate this template from inside a loop and replace the Parameter Value attribute with a different stock symbol each time through the loop. To submit a request, simply drop the generated XML into the Polstergeist "C:\Program Files\Polstergeist\Data\Input\External\" directory under the filename of your choice. The extractor will process the request, place the resulting response file in the directory "C:\Program Files\Polstergeist\Data\Output\External\" and give it the same name as the corresponding input file. Your program can poll the output directory at preset intervals to determine whether the extraction job has completed.
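The submit-and-poll steps above can be sketched in Python as follows. This is only an illustration, not an official client. It assumes the template contains one or more elements named Parameter carrying a Value attribute, as the text describes. The function names, the polling interval and the maximum wait are hypothetical choices of ours. The sketch also does not guard against finding a response file that the extractor is still writing.

```python
import time
import xml.etree.ElementTree as ET
from pathlib import Path

# Default locations quoted in the text above; adjust them if you installed elsewhere.
INPUT_DIR = Path(r"C:\Program Files\Polstergeist\Data\Input\External")
OUTPUT_DIR = Path(r"C:\Program Files\Polstergeist\Data\Output\External")


def build_request(template_path, symbol):
    """Return the template as XML bytes with each Parameter's Value set to symbol.

    Assumption: the parameter element is named "Parameter" and has a "Value" attribute.
    """
    root = ET.parse(template_path).getroot()
    for param in root.iter("Parameter"):
        param.set("Value", symbol)
    return ET.tostring(root, encoding="utf-8")


def submit_request(request_xml, filename, input_dir=INPUT_DIR):
    """Drop the request into the external input directory under the given filename."""
    target = Path(input_dir) / filename
    target.write_bytes(request_xml)
    return target


def wait_for_response(filename, output_dir=OUTPUT_DIR, poll_seconds=5.0, max_wait=300.0):
    """Poll the output directory for a file with the same name as the request.

    Returns the response path, or None if nothing appeared within max_wait seconds.
    """
    response = Path(output_dir) / filename
    deadline = time.monotonic() + max_wait
    while True:
        if response.exists():
            return response
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_seconds)
```

A program could loop over a list of stock symbols, call build_request and submit_request once per symbol with a filename such as MSFT.xml, and then call wait_for_response with the same filename to find out when each job has finished.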

The following shows a partial view of the response that was generated from the request we submitted manually from the Polstergeist Browser at the start of this example. Output from requests generated by the Polstergeist Browser can be found in the directory "C:\Program Files\Polstergeist\Data\Output\Reports\". The response generated for an external request will look exactly the same with one exception: the Sponsored attribute on the Record element won't exist. The Record element will be repeated multiple times for multi-record responses. In this example there were actually eight records for various strike prices in January and February.

Output Template 
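Once a response file is available, your program will need to read its Record elements. The Python sketch below shows one way this might be done. Only the Record element and its optional Sponsored attribute come from the description above. How the individual fields are stored inside a record (as attributes, as child elements, or both) is an assumption on our part, and read_records is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def read_records(response_path):
    """Return one dictionary per Record element found anywhere in the response file."""
    root = ET.parse(response_path).getroot()
    records = []
    for record in root.iter("Record"):
        fields = dict(record.attrib)
        # Sponsored only appears on browser-generated reports; drop it so that
        # browser and external responses are handled the same way.
        fields.pop("Sponsored", None)
        # Assumption: field values may also be stored as child elements.
        for child in record:
            fields[child.tag] = (child.text or "").strip()
        records.append(fields)
    return records
```

For a multi-record response such as the options example, the returned list would hold one dictionary per Record element.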

Data requirements for a program are often quite different from those of a person casually browsing the web. For example, whereas a user might only be interested in the first few pages of an autos site, a researcher might require several pages (or all available pages) of data. An interactive user might be interested in the current price of a handful of stocks, whereas a researcher might need large volumes of historical stock quotes. For this reason Polstergeist will be creating a special Research Catalog containing data extraction services targeted at the programming community. For the time being this will be a free service, but eventually we plan to charge a subscription-based fee for users who wish to interact with the Polstergeist engine programmatically.

 

FAQ

Q: On some requests the extractor does not appear to be doing anything while it is collecting data. Why is this?

A: Some services download content directly rather than navigating through a web site, in which case Polstergeist simply displays an empty Extractor window while it retrieves the data. Content derived from RSS news feeds is often collected in this way, and it may appear that the extractor is locked up when it is in fact busy extracting data. Simply give the job some time to complete; the greater the number of services in a request, the longer it will take to collect the content for your report. In general, if you have a high-speed internet connection, no data extraction request should take more than 3 minutes to complete.

Q: Why does the message "Data Not Available" appear on certain reports?

A: A web site from which we are attempting to extract data may be experiencing problems. It is also possible that the data being requested is not available from that particular site. For example, an airline may not fly to a particular destination for which you are requesting price information. It is also possible that the web site has changed and our data maps are out of date, in which case we will remap the site when we discover the problem. The Polstergeist team regularly tests all the services in our catalogs, so any data mapping issues will be resolved fairly quickly.

Q: What does the "Cancel Remaining Items" button on the Data Extractor do?

A: It lets the user stop the request that is currently being processed. All services that were not yet completed at the time the Cancel button was pressed will contain a "Canceled by User" message in the report. Services for which the corresponding extractor window has already closed will contain the expected report data. This can be useful if it is obvious that a particular extractor window is simply waiting to time out and all the other windows have already closed. An example might be a window displaying an error message.

Q: Can services from different Categories be combined into a single request?

A: Yes, you can combine up to 50 services into a single request, provided they all reside in the same Catalog.

Q: Why do you have a Catalog selection list if you have only one Catalog (United States)?

A: Currently we have only one catalog, but as we expand we will be adding more, perhaps catalogs for other countries or localized catalogs for a particular city. We might also begin to create specialized catalogs in order to keep our content manageable as we grow.

Q: Do I have to wait for an extraction job to complete before using the Polstergeist Browser for other tasks?

A: No, you can simply minimize the extractor after you submit a request and then perform some other task while you wait for the request to complete. For example, you might view content from a previously submitted report, submit another request or simply do some web browsing from the browser tab. You will be alerted once your extraction report has been completed.

Q: Is there a way to turn off the Report Completion Alerts?

A: Yes. There is an option you can set in the ApplicationSettings.xml file, located in the Data\Settings subdirectory of the directory where you installed the Polstergeist application. By default this will be C:\Program Files\Polstergeist\Data\Settings\. Open the file using either an XML editor or a simple text editor like Notepad. Look for the entry <ReportAlerts>True</ReportAlerts> and replace True with False.
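If you would rather make this change from a script than by hand, a Python sketch along the following lines might work. It assumes the settings file is plain XML without namespaces and that the ReportAlerts element may sit at any depth. The function name set_report_alerts is hypothetical. Note that rewriting the file this way discards any XML comments it contained.

```python
import xml.etree.ElementTree as ET

# Default location quoted in the answer above.
SETTINGS_FILE = r"C:\Program Files\Polstergeist\Data\Settings\ApplicationSettings.xml"


def set_report_alerts(enabled, settings_file=SETTINGS_FILE):
    """Set <ReportAlerts> to True or False; return False if the element is missing."""
    tree = ET.parse(settings_file)
    element = next(tree.getroot().iter("ReportAlerts"), None)
    if element is None:
        return False  # element not found, so the file is left untouched
    element.text = "True" if enabled else "False"
    tree.write(settings_file, encoding="utf-8", xml_declaration=True)
    return True
```

Calling set_report_alerts(False) would turn the alerts off, and set_report_alerts(True) would turn them back on.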

Q: My extraction requests seem to time out before they are completed; is there anything I can do?

A: You probably have a slow internet connection. Open the ApplicationSettings.xml file (located in C:\Program Files\Polstergeist\Data\Settings\) using a text editor such as Notepad. Look for the entry <TimeoutInterval>60</TimeoutInterval>. By default Polstergeist will close an extractor window after detecting 60 seconds of inactivity. Try increasing the TimeoutInterval value.
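The same change can be scripted. The Python sketch below raises the timeout but never lowers it. It assumes the TimeoutInterval element holds a whole number of seconds and that the file uses no XML namespaces. The function name raise_timeout is hypothetical, and rewriting the file this way discards any XML comments it contained.

```python
import xml.etree.ElementTree as ET

# Default location quoted in the answer above.
SETTINGS_FILE = r"C:\Program Files\Polstergeist\Data\Settings\ApplicationSettings.xml"


def raise_timeout(seconds, settings_file=SETTINGS_FILE):
    """Raise <TimeoutInterval> to `seconds` if that is larger than the current value.

    Returns the value stored in the file after the call.
    """
    tree = ET.parse(settings_file)
    element = next(tree.getroot().iter("TimeoutInterval"), None)
    if element is None:
        raise ValueError("TimeoutInterval element not found in settings file")
    current = int(element.text)
    if seconds <= current:
        return current  # already at least this long, so nothing is written
    element.text = str(seconds)
    tree.write(settings_file, encoding="utf-8", xml_declaration=True)
    return seconds
```

For instance, raise_timeout(120) would change the default of 60 seconds to 120 seconds.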

Q: Can I install the Polstergeist application in a directory other than C:\Program Files\Polstergeist?

A: Yes, you can install it and then copy all files and subdirectories to another directory. Alternatively, you can download the application components individually and place them in the directory of your choice. Then point a shortcut to the Launcher.exe file, which is the proper entry point for the application. The application components are stored at "www.polstergeist.com/download". The components you will need are as follows: "Launcher.exe", "DataBrowser.exe", "DataExtractor.exe", "Common.dll", "RequestParameters.exe".