Current location - Trademark Inquiry Complete Network - Futures platform - Central database design
Central database design
6.5438+0 database

According to the development requirements of the system and the functions and functions of the database, it can be divided into three categories: risk inquiry, risk assessment and system management (Sashixuan et al., 2000). The main data are shown in Table 5.5.

Table 5.5 Main Data Sheet of Risk Management System for Overseas Oil, Gas and Metal Mineral Resources Development

sequential

5.2.2.2 data warehouse

Oil price data come from the websites of Energy Information Administration (EIA), China Petroleum and Natural Gas Corporation (CNPC) and WSJ. The oil price series itself is an irregular time series, and the oil price data has the following characteristics.

(1) data consistency is poor.

Oil price data formats are diverse and redundant, which are mainly reflected in the following aspects: the data formats used are different and the subsystems are relatively independent. In the case of websites acting alone, there is generally no problem, but if the data of these different systems or different periods are centralized and comprehensively utilized, the data may be incomplete, inconsistent or repeated.

(2) the dispersion of data storage

There are many sources of oil price data and lack of unified management, and there is no corresponding automatic crawling operation of web data to realize the localization operation of data.

(3) Insufficient development of data resources

Large-capacity data leads to insufficient development and utilization of data resources, and lack of in-depth analysis, synthesis, refining, mining and display of the obtained data, such as the metadata of futures contracts formulated by various analysis institutions, which makes it difficult to re-develop and utilize rich statistical data resources.

According to the types of oil and gas products, the contract date of oil and gas products, the price types of oil and gas products and the price differences of oil and gas products in different markets, we can deepen our understanding of oil price trends. This time-varying, unmodifiable and integrated nature of oil price makes us understand the original data from multiple angles and truly reflect its characteristics, and also makes us find it necessary to use an integrated technology to accurately predict oil prices.

The construction process of data warehouse is gradually realized from bottom to top, as shown in Figure 5. 13.

Figure 5. 13 Data Warehouse Construction Process

1) data source.

A. the complexity of the data source. Data are scattered in database management systems, spreadsheets, e-mail systems, electronic documents and even paper. Among the three data sources that need to be collected by the system, the oil price-related events stored on the EIA website are updated slowly. Although the oil price data of each market will be downloaded every day, week, month and year, the format of table fields will often change after downloading, which makes it more difficult to achieve the requirement of automatically obtaining data and downloading it to local automatic storage. In addition to the above three pieces of data, IP addresses with excessive traffic will also be blacklisted, making it impossible to download them to local storage. It takes a lot of energy to build a unified model for these data.

B. Validity of data. Due to the limitation of experience, how to deal with the null value of data, the format of time fields in different time intervals, the problems that should be paid attention to when storing data and so on. If the application fails to check the validity of the data, it will have a great impact on the multidimensional display of the data, so it also comes down to the data quality of the data source.

C. data integrity. The data on the data source is not so obvious or easy to obtain. Oil price is highly sensitive data, so although various websites provide daily, monthly or annual data of various oil trading markets, their completeness cannot be completely guaranteed. According to different enterprise policies, it sometimes takes a lot of energy to obtain data. Therefore, different data sources should be established to ensure the integrity of the obtained data.

2) Data processing.

Efficient cube display is inseparable from the accurate acquisition of data from the underlying data sources, or data understanding and data cleaning. Therefore, the system has achieved the expected requirements in metadata acquisition, processing, warehousing and cube display.

A.ETL. This function is one of the cores of the whole oil price data warehouse. Its main function is to extract data from the relevant system tables according to the predefined data table correspondence, and finally load the correct data into the source data of the data warehouse after data cleaning and conversion, which will be the basis for future application.

B. data conversion. This function is to transform data according to defined rules in the process of data extraction, which avoids the diversity of data in analysis and ensures the consistency of data.

C. data integration. This function is mainly to integrate the source data of the oil price information data warehouse system in a thematic way according to the predefined calculation logic and store it in a new data structure.

3) data storage.

Star schema is an important logical structure of multidimensional display in data warehouse development. Several important characteristics of star model are dimension, degree and attribute, which are expressed as fact table and dimension table in practical application. In the oil price data, the spot price table of each market is the fact table of the data warehouse, and the oil type and contract date are the dimension tables.

The star model design scheme of oil price data warehouse is as follows:

A. fact sheet. Taking the EIA spot price list in database tables (including daily, weekly, monthly and chronological tables) as the fact table in data warehouse, several star models, namely constellation models, are formed according to different time dimensions. In these price lists, market number, oil and gas product type, futures contract date and price unit measurement number are used as primary keys and foreign keys to connect other dimension tables, forming the basis of multi-dimensional display linkage, and oil price data and other factual data are used as record data and main output results.

B. Dimension table. According to market, oil products, price data, weights and measures and event types, it is the angle and goal of multidimensional analysis in oil and gas data warehouse.

Figure 5. 14 Take EIA daily futures data table as an example to build a star model, and the model structure diagrams of other different time dimensions are basically the same as this figure.

Figure 5. 14-day futures price star model taking EIA data as an example

On the basis of star model design, the prototype design of business data storage (ODS) in data storage is improved, and the data environment of the middle layer between DB and DW is provided, which can realize business data integration and data exchange between systems.