Review of Piranhaweb version 1.5 from Thomson Financial
by Henrik Mathiesen, 18/3/2001


Introduction
Piranhaweb is one of several web-based retrieval systems for corporate information that Thomson Financial offers to financial analysts, portfolio managers, and increasingly to academics (www.piranhaweb.com). This review looks at the system from an academic point of view and therefore focuses on issues of particular relevance to academics doing empirical research on firms. Much of the review is given as video screen-captures, because it is difficult to understand the benefits and problems of an information retrieval system until it is seen in action.

Which data are available?
The following databases are accessible by Piranhaweb when subscribing to the entire system:

  • The Worldscope Database contains standardized financial, statistical and market information for over 23,000 active global companies from over 55 countries worldwide. Worldscope covers up to 15 years of data and currently has more than 1000 data items or variables. View sample sheet.

  • The SEC Database contains standardized financial, statistical and market information for over 11,000 actively traded U.S. companies. The data cover up to 15 years and currently comprise about 700 data items. View sample sheet.

  • The Compustat Database contains standardized financial, statistical and market information for over 10,000 actively traded U.S. companies, as well as over 10,900 inactive U.S. companies. The data go back 20 years and currently comprise about 500 variables. View sample sheet.

  • The Extel Database contains "as reported" fundamental financial data for approximately 15,000 quoted companies from over 55 countries worldwide. Extel provides 15 years of historical data from the balance sheet, profit/loss, and cash flow statements and currently has about 1100 variables. View sample sheet.

  • The IBES Database contains current I/B/E/S Summary and Detail Estimate coverage for over 18,000 companies in 60 countries worldwide. More than 850 firms contribute data to I/B/E/S, from the largest global houses to regional and local brokers. Estimates for EPS, DPS, Sales, CPS, and more are provided for up to 5 years forward. The database currently has about 200 variables. View sample sheets: current and history.

  • The US Pricing Database, provided by Interactive Data Corporation (IDC), contains security pricing, dividends and earnings data for 32,000 US equity securities. Updated daily after the market close, the data cover up to 10 years of "rolling" history. About 50 variables are currently available. View sample sheet.

Although many data items or variables are listed for each database, this does not mean that they are available for every firm and every year. Whether an item is available depends on what the law in a particular country or jurisdiction requires the firm to publish. Furthermore, some items are relevant for some industries but not for others. Many data items are also calculated from other variables. The point is that the databases are not as voluminous as they appear at first glance. On the other hand, the financial databases are not the only data available in the Piranhaweb system. The system also provides access to hundreds of thousands of publicly available filings from tens of thousands of firms, plus numerous articles from various financial newspapers and bulletins.

How are data accessed?
The data can be accessed in two different ways. One is to use a standard web browser and the other is to use the Piranhaweb toolbar plug-in for Microsoft Excel. Click the links to view video screen-captures on how to capture data: 1) Data by Excel toolbar. 2) Data by browser.

The most exciting feature of the Piranhaweb system is the Piranhaweb toolbar (version 1.6.0.0) for Excel, which enables an intuitive and easy download of data directly to an Excel spreadsheet. A one-time installation and configuration of the toolbar is necessary before it can be used. The toolbar has a button that launches a wizard, which guides you through a few simple steps that define which data variables from which firms should be downloaded to the Excel spreadsheet, and how. No knowledge of the underlying data query language is required. The wizard automatically generates all the complicated code that makes it possible for Excel to retrieve data from the Piranhaweb servers. This simplicity does not come at a great cost in functionality. The wizard allows you to specify which database to download from, the portfolio of firms, the data variables, the dates of the data variables, the headings of the data variables, the currency translations if any, whether to correct for stock splits and other capital changes, the measurement scale, and the data input location in the spreadsheet. The data can be downloaded either as time series or as cross-sectional data (Click to view sample spreadsheet of cross-sectional or time-series (pooled) data made by the Piranhaweb toolbar). One of the major benefits of the Piranhaweb toolbar is that it is easy to use. After a few hours you will know enough about the toolbar to download data to Excel in a format that is easy to import into statistical programs such as the SAS System from SAS Institute (Click to view code for SAS System, version 8, that enables automatic import of data from an Excel spreadsheet created by the Piranhaweb toolbar; use it as you please).

System limitations and system stability
Although it only takes a few hours to learn how to get data into Excel in a format that is ready for import into statistical programs like SAS, it takes weeks of usage to learn the limitations of the Piranhaweb solution. To test these limitations I downloaded more than 15 million data entries during three weeks using the Piranhaweb toolbar (one data entry corresponds to a number or a text in one cell of the Excel spreadsheet). Academic research on firms is typically based on the selection of a representative portfolio of firms whose characteristics (e.g. financial or contractual) are statistically analyzed in order to validate various economic theories about the relations among those characteristics (Click to see examples of relations that are relevant for research in corporate governance).

I used the search feature of Piranhaweb to create a portfolio of 1938 firms from the New York Stock Exchange. Specifically, they were all firms on the NYSE in 1999 that had strictly positive values of the total assets variable from the SEC database (Click to view a video screen-capture on how to create portfolios of firms using the Piranhaweb search facility). This portfolio was used to produce several Excel spreadsheets containing all 1034 variables from the Worldscope database, all variables from the SEC database, and all variables from the Compustat database. Other spreadsheets were produced using selected variables from the databases IBES, IBES history, Extel, Currency, and Pricing. All these spreadsheets were made as cross-sectional data for the years 1999, 1998 and 1997. Further spreadsheets were made as 10-year time series ending in 1999, using selected variables from selected databases but still containing data from all 1938 NYSE firms. All in all, about 15 million data entries, or 200 MB of data, were processed and downloaded.

Such an exercise reveals the limitations of the system. One limitation is that the Piranhaweb system allows a maximum of 5000 firms in one portfolio. This was not a problem, since my portfolio of 1938 firms was well below that limit. The 5000-firm limit is very good compared to other systems that I have seen, which typically have a limit of 500 to 1000 firms per portfolio. Another limitation is the processing power of the PC used to download the data. For some reason the download process consumes a great deal of PC processing power. If you use the Piranhaweb toolbar to start a job that will download 60,000 data entries, it will take about one hour to complete on a PC with a 500 MHz Pentium III. During this time it is best not to use other programs, because the PC will be very slow. The bottleneck is neither the Piranhaweb system nor the speed of your internet connection. It is the speed of your PC. If it is twice as fast, you will be able to get the data in half the time. To speed up the job I used three computers to get the data (two Windows 2000 servers and one Windows 2000 workstation with a Microsoft Terminal Services client to remote control the two servers).
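The arithmetic behind these limits can be sketched in a few lines of Python. The figures (about 60,000 entries per hour on a 500 MHz PC, 1938 firms, 1034 Worldscope variables, three PCs) are taken from the text; the function names are hypothetical, and the assumption that throughput adds up linearly across machines is a simplification rather than something that was measured.

```python
import math

ENTRIES_PER_HOUR = 60_000     # observed throughput of one 500 MHz PC (from the text)
MAX_ENTRIES_PER_JOB = 60_000  # job size used for the roughly one-hour jobs (from the text)


def firms_per_job(n_variables, max_entries=MAX_ENTRIES_PER_JOB):
    """Largest number of firms whose cross-section fits in one job."""
    return max(1, max_entries // n_variables)


def plan_jobs(n_firms, n_variables, n_pcs=1):
    """Return (number of jobs, total entries, estimated wall-clock hours).

    Assumes every PC downloads at ENTRIES_PER_HOUR and that the PCs
    work in parallel without slowing each other down.
    """
    per_job = firms_per_job(n_variables)
    n_jobs = math.ceil(n_firms / per_job)
    total_entries = n_firms * n_variables
    hours = total_entries / (ENTRIES_PER_HOUR * n_pcs)
    return n_jobs, total_entries, hours


# One year of all Worldscope variables for the NYSE portfolio, on three PCs.
jobs, entries, hours = plan_jobs(n_firms=1938, n_variables=1034, n_pcs=3)
print(jobs, entries, round(hours, 1))  # 34 2003892 11.1
```

Under these assumptions a single cross-section of all Worldscope variables is already about two million entries, which is why the portfolio has to be split into several dozen jobs of 58 firms each.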

The first problematic issue that I detected in the Piranhaweb system itself is that it becomes unstable as the size of the data-acquiring job increases. In particular, I cannot recommend starting a job that downloads more than 60,000 data entries on a 500 MHz PC. I did manage to complete a job with 147,000 entries on such a machine, but I must have been lucky, because all subsequent attempts to execute jobs of similar size failed. It is quite disappointing when a job fails, because even if it fails in the last minute before completion all your data are lost and you have to start the entire job again. A job fails because the internet connection to the Piranhaweb servers is lost. This happens for various reasons, but in most of the cases that I experienced it was because the Piranhaweb servers were down, sometimes for an hour or two. I ran numerous one-hour, 60,000-entry jobs and I estimate that about one in seven failed to complete. This suggests that the Piranhaweb servers are down two or three times a day. It is probably done deliberately in order to update the databases. Fortunately, Piranhaweb runs at four different locations (e.g. www.piranhaweb.com and www.piranhawebuk.com) and I never experienced all of the locations being down at the same time. So whenever a job failed I simply changed the location and restarted the job. Still, I believe Thomson Financial should invest some time in eliminating this instability (and end-user inconvenience) of the Piranhaweb system. It could be done either by adding a recovery function to the Piranhaweb toolbar or by using a more redundant web server architecture.
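The manual workaround of changing location and restarting could in principle be automated. The following Python sketch shows the idea only: `download_job` is a hypothetical stand-in for whatever performs one download (the toolbar itself is driven through Excel and is not known to expose such a function), and only the two server addresses named in this review are listed.

```python
MIRRORS = ["www.piranhaweb.com", "www.piranhawebuk.com"]  # the two locations named in the text


def run_with_failover(job, download_job, mirrors=MIRRORS, max_rounds=3):
    """Try each mirror in turn and return (mirror, result) from the first success.

    download_job(mirror, job) is a hypothetical callable that raises
    ConnectionError when the connection to that mirror is lost.
    """
    last_error = None
    for _ in range(max_rounds):
        for mirror in mirrors:
            try:
                return mirror, download_job(mirror, job)
            except ConnectionError as exc:
                # The partial download is lost, so restart the whole job elsewhere.
                last_error = exc
    raise RuntimeError(f"all mirrors failed after {max_rounds} rounds") from last_error


# Demonstration with a fake downloader in which the first mirror is down.
def fake_download(mirror, job):
    if mirror == "www.piranhaweb.com":
        raise ConnectionError("server down")
    return f"{job} downloaded"


print(run_with_failover("job-01", fake_download))
# ('www.piranhawebuk.com', 'job-01 downloaded')
```

Because a failed job loses all of its data, keeping each job small (as recommended above) limits how much work a single failure costs.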

Data-validation, data documentation, and support
With regard to data validation, Piranhaweb offers two different ways to check that the data are what they are supposed to be. One is to compare the same data variables from different databases. If they differ significantly, something is wrong. The other is to check suspicious database entries against the actual reports that have been filed. In particular, Piranhaweb makes available, as pdf image-files, copies of most of the filings that the legislation of various jurisdictions requires firms to publish. In this way the Piranhaweb system stores hundreds of thousands of files that can be used for validation or further inquiry. Apart from your own ability to check the quality of the data, Piranhaweb also runs various tests itself to check that the data are correct, such as checking that basic accounting identities add up. However, they cannot avoid overlooking something, and I have been able to find figures that did not make sense, such as firms with negative insider ownership or ownership above 100%. In such cases one can delete the observation, substitute a similar figure from another database, or take a look at the pdf file of the relevant filing. In terms of productive customer feedback, I believe Thomson Financial will benefit from their academic accounts, because I imagine that academics are more careful about data validation and have more time to report errors to Piranhaweb than business people such as portfolio managers.
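Both kinds of check are easy to run on downloaded data before any statistical analysis. The Python sketch below is illustrative only: the field names, the sample figures and the 5% tolerance are assumptions made for the example, not Piranhaweb variable names or recommended thresholds.

```python
def ownership_is_plausible(pct):
    """Insider ownership is a percentage, so it must lie between 0 and 100."""
    return pct is not None and 0.0 <= pct <= 100.0


def databases_disagree(value_a, value_b, rel_tol=0.05):
    """True when two sources differ by more than rel_tol (5% here, an arbitrary choice)."""
    if value_a is None or value_b is None:
        return False  # nothing to compare
    scale = max(abs(value_a), abs(value_b))
    if scale == 0:
        return False  # both values are zero, so they agree
    return abs(value_a - value_b) / scale > rel_tol


# Made-up records: the same total assets figure taken from two databases.
records = [
    {"firm": "A", "insider_pct": 12.5, "assets_ws": 1000.0, "assets_sec": 1003.0},
    {"firm": "B", "insider_pct": -3.0, "assets_ws": 200.0, "assets_sec": 200.0},
    {"firm": "C", "insider_pct": 104.0, "assets_ws": 500.0, "assets_sec": 650.0},
]

bad_ownership = [r["firm"] for r in records if not ownership_is_plausible(r["insider_pct"])]
bad_assets = [r["firm"] for r in records if databases_disagree(r["assets_ws"], r["assets_sec"])]
print(bad_ownership, bad_assets)  # ['B', 'C'] ['C']
```

Flagged observations are candidates for deletion, substitution from another database, or a look at the filed report, as described above.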

One issue that was a bit disappointing is that the Piranhaweb system seems to have a default configuration that gives you the most recent data available if you ask for a variable in a year for which it is not available. I discovered this failure because I downloaded variables for several years. Click to see an example of this error on a time series of insider ownership from the SEC database. This failure generally means that you have to download at least two years to check that you have the data from the year you queried. It must be possible to reconfigure the Piranhaweb system to return an N/A message in the relevant Excel cell when you ask for a variable for a year that is not present. It should be mentioned that this error is probably limited to ownership variables in the SEC database.
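Once several years have been downloaded, a simple heuristic can point to the cells most likely affected. The Python sketch below flags years whose value is exactly equal to the value of the most recent year in the series; the function name and the sample figures are made up for illustration, and an exact match is only a hint, since a figure can genuinely remain unchanged from year to year.

```python
def suspect_fallback_years(series):
    """Return the years whose value exactly equals the most recent year's value.

    series maps year -> value. Flagged years still need a manual check
    against the filings, because a value may truly be unchanged.
    """
    if not series:
        return []
    latest_year = max(series)
    latest_value = series[latest_year]
    return sorted(
        year
        for year, value in series.items()
        if year != latest_year and value == latest_value
    )


# Made-up insider ownership series (percent) for one firm.
insider_pct = {1995: 8.4, 1996: 8.4, 1997: 11.2, 1998: 9.7, 1999: 8.4}
print(suspect_fallback_years(insider_pct))  # [1995, 1996]
```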

With regard to documentation of the data variables, Piranhaweb offers some manuals that can either be downloaded as Microsoft Word files from the Piranhaweb help web site or ordered in paperback from the Piranhaweb support office. These manuals are fine and answer most questions. However, documentation can never be too good, and it would be nice if Piranhaweb had more numerical examples of how the ‘synthetic’ variables are calculated. By synthetic I mean variables that are calculated from other variables, such as growth rates or averages of EPS or sales. The documentation should be detailed enough to make it easy to reproduce such measures for the sake of data verification. Although Piranhaweb is rich in data variables, there is one synthetic measure that is missing but whose presence would add considerable value for academics (and eventually for financial analysts). This is the performance measure Tobin’s Q, which is theoretically defined as the market value of the firm’s outstanding financial claims divided by the market cost of replacing all the assets represented by the firm’s financial claims. Equity investors can use Tobin’s Q as an indicator of whether or not it will be profitable to finance expansion plans. Tobin’s Q can be approximated using measures that are available through Piranhaweb (in particular the Compustat database), but it is a very complicated and time-consuming job. Instead of having researchers do this individually, the job should be centralized so that different versions of Tobin’s Q are available through Piranhaweb.[1]
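To give an impression of what the simplest version looks like, the approximation of Chung and Pruitt (1994), cited in the footnote, computes q as (MVE + PS + DEBT) / TA, where MVE is the share price times the number of common shares outstanding, PS is the liquidating value of preferred stock, DEBT is short-term liabilities net of short-term assets plus the book value of long-term debt, and TA is the book value of total assets. The Python sketch below implements that formula; the parameter names and the sample figures are chosen for illustration and do not correspond to Piranhaweb or Compustat variable names.

```python
def approximate_q(share_price, shares_outstanding, preferred_stock,
                  current_liabilities, current_assets, long_term_debt,
                  total_assets):
    """Approximate Tobin's Q following Chung and Pruitt (1994).

    All monetary inputs must be expressed in the same currency and scale.
    """
    if total_assets <= 0:
        raise ValueError("total_assets must be strictly positive")
    market_value_equity = share_price * shares_outstanding
    debt = (current_liabilities - current_assets) + long_term_debt
    return (market_value_equity + preferred_stock + debt) / total_assets


# Made-up firm: all amounts in millions, shares in millions.
q = approximate_q(
    share_price=20.0,
    shares_outstanding=50.0,
    preferred_stock=50.0,
    current_liabilities=300.0,
    current_assets=400.0,
    long_term_debt=350.0,
    total_assets=1000.0,
)
print(round(q, 2))  # 1.3
```

The more elaborate measures in the other references estimate replacement costs rather than using book values, which is where most of the complexity lies.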

Finally, it should be mentioned that Piranhaweb is backed by a support staff that can be reached by phone. I believe they are available 24 hours a day, because they have clients all around the world. I never had any problem getting someone to help me, and they were always competent and polite.

Conclusion
Piranhaweb is the best information retrieval system for corporate information that I have seen so far. It has thousands of data variables for tens of thousands of the largest firms in the world. Moreover, the system offers the ability to validate data by comparing the same variables collected from different databases. Alternatively, data can be validated directly by searching Piranhaweb’s huge web-based archive of pdf image-files of the various filings that firms are required to publish. The feature I found most impressive was the Piranhaweb toolbar for Microsoft Excel. In less than 2 hours I knew exactly how to get corporate data into a spreadsheet in a format that is ready for import into a statistical program such as the SAS System from SAS Institute. The Piranhaweb system scales well, and I was able to gather more than 15 million data entries, or 200 MB of data, during three weekends. However, two issues could have been better. The Piranhaweb servers are down too often, causing inconvenient loss of data during downloads. Furthermore, it is unfortunate that the system sometimes returns current values when you query for a variable in a year for which it is not available (this is probably only a problem for the SEC ownership variables). Nevertheless, the overall impression of the Piranhaweb system is very convincing, and on a scale from 6 to 0, with 6 being the highest mark, it gets 5.

 



[1] Some key references on how to calculate Tobin’s Q are: Lewellen, W. G., and S. G. Badrinath (1997). “On the Measurement of Tobin’s Q,” Journal of Financial Economics, 44, 77-122; Lindenberg, E., and S. Ross (1981). “Tobin’s Q Ratio and Industrial Organization,” Journal of Business, 54, 1-32; Lee, Darrell E., and James G. Tompkins (1999). “A Modified Version of the Lewellen and Badrinath Measure of Tobin’s Q,” Financial Management, 28, 1, 20-31; Chung, K. H., and S. W. Pruitt (1994). “A Simple Approximation of Tobin’s Q,” Financial Management, Autumn, 70-74.

 
Copyright © 1999 - 2012 ViamInvest. All rights reserved.