Thursday, November 11, 2010

ETL Questions Part-IV

151. What are data-marts?
· Data Marts are used to exchange data between different BW systems or to update data within the same BW system (Myself Data Mart). Here, the InfoProviders that are used to provide data are called Data Marts.
152. Which one is more normalized; ODS or InfoCube?
· InfoCube is more normalized than ODS.
153. What is replication of DataSource?
· Replication of DataSource enables the extract structure from the source system to be replicated in the BW.
154. What are the quality checks for inefficient InfoCube designs?
· Huge Dimension tables make an InfoCube inefficient.
· Queries take a long time to run.
155. Why is star schema not implemented for ODS as well?
· Because the ODS is meant to store detailed, document-level data for quick operational use and short-term decision making.
156. Why do we need separate update rules for characteristics on each key figure?
· If the requirement specifies a different need for each characteristic, we define separate update rules for each of the characteristics.
157. What is the use of Hierarchies?
· Efficient reporting is one of the targets of using hierarchies. Easy drilldown paths can be built using hierarchies.
158. What is "Referential Integrity"?
· A feature provided by relational database management systems (RDBMS) that prevents users or applications from entering inconsistent data. For example, suppose Table B has a foreign key that points to a field in Table A.
o Referential integrity would prevent you from adding a record to Table B that cannot be linked to Table A.
o Referential integrity rules might also specify that whenever you delete a record from Table A, any records in Table B that are linked to the deleted record will also be deleted. This is called cascading delete.
o Referential integrity rules could specify that whenever you modify the value of a linked field in Table A, all records in Table B that are linked to it will also be modified accordingly. This is called cascading update.
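Outside of BW, the three rules above can be demonstrated with a small SQLite sketch (the table and column names here are hypothetical, chosen to mirror the Table A / Table B example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

# Table A (parent) and Table B (child) with a cascading-delete rule
conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY)")
conn.execute("""CREATE TABLE b (
    id INTEGER PRIMARY KEY,
    a_id INTEGER REFERENCES a(id) ON DELETE CASCADE)""")

conn.execute("INSERT INTO a VALUES (1)")
conn.execute("INSERT INTO b VALUES (10, 1)")

# Inserting a child record that cannot be linked to Table A is rejected
try:
    conn.execute("INSERT INTO b VALUES (11, 99)")
except sqlite3.IntegrityError:
    print("orphan insert rejected")

# Deleting the parent record cascades to the linked rows in Table B
conn.execute("DELETE FROM a WHERE id = 1")
print(conn.execute("SELECT COUNT(*) FROM b").fetchone()[0])  # 0
```

The same mechanism underlies referential-integrity checks against master data during BW loads: a record that points at a nonexistent key is treated as inconsistent.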
159. What is a Transactional InfoCube and when is it preferred?
· Transactional InfoCubes differ from Basic InfoCubes in their ability to support parallel write accesses. Basic InfoCubes are technically optimized for read accesses to the detriment of write accesses. Transactional InfoCubes are designed to meet the demands of SEM, where multiple users write simultaneously into an InfoCube.
160. When is data in Change Log table of ODS deleted?
· When requests loaded into the ODS object are required neither for a delta update nor for initialization, they can be deleted. If a delta initialization for the update exists in connected data targets, the requests have to be updated there first before the data can be deleted.
161. How is the data of different modules stored in R/3?
· Data is stored in multiple tables in R/3 based on ERM (Entity Relationship model) to prevent the redundant storage of data.
162. In what cases do we transfer data from one InfoCube to another?
· Structural modifications can't be made to an InfoCube while it contains data. If we want to modify an InfoCube and no backup of the data exists, we can design another InfoCube with the required parameters and load the data from the old InfoCube into it.
163. How often do we have a multi-layered structure in the ODS stage, and in what cases?
· Multi-layered structure in ODS stage is used to consolidate data from different DataSources.
164. How is data extracted from systems other than R/3 and Flat files?
· Data is extracted from systems other than R/3 and flat files using staging BAPIs.
165. When do tRFC and IDoc errors occur?
· tRFC and IDoc errors occur during data loads. They are connection-specific: if the connection to the source system is not set up properly or is interrupted, you get these errors.
· An Intermediate Document (IDoc) is a container for exchanging data between R/3, R/2 and non-SAP systems. IDocs are sent in the communication layer by transactional Remote Function Call (tRFC) or by other file interfaces (for example, EDI). tRFC guarantees that the data is transferred only once.
166. On what factors does the loading time depend?
· Loading time depends on the workload on both the BW side and the source system side. It can also depend on network connectivity.
167. How long does it take to load a million records into an InfoCube from an R/3 system?
· Loading time varies with the workload on the BW and source system sides. Typically it takes about half an hour to load a million records.
168. Will the loading time be same for the same amount of data for non-SAP systems like Flat files?
· It might not be the same. It depends on the extraction programs used on the source system side.
169. What is mySAP.com?
· mySAP.com is the SAP solution for integrating all relevant business processes on the Internet. It integrates business processes in SAP and non-SAP systems seamlessly and provides a complete business environment for electronic commerce.
170. How was Data modeling done in your project? Explain
· Initially we study the client's business processes: what kind of data flows through the system, its volume, the changes taking place in it, the analysis users perform on the data, what they expect in the future, and how we can use BW functionality. We then hold meetings with the business analysts and propose a data model based on the client's needs. Next we give a proof-of-concept demo showing how we are going to build a BW data warehouse for their system. Once we get approval, we start requirements gathering, build the model, and follow up with testing in QA.
171. You said you have worked on InfoCubes and ODS objects. Which one is better suited for reporting? Explain the benefits and drawbacks of each.
· Depending on the type of report, the data is stored in an InfoCube or an ODS object. BW is used to store high volumes of data and for fast reporting. The InfoCube is used to store normalized data: master data and transaction data are stored as per the Extended Star Schema using SIDs, and reporting on it is fast.
· An ODS object stores data in more detail, using a structure of transparent tables, so reporting on it is slower. The ODS object is better suited for RRI (Report-Report Interface).
172. How do you measure the size of InfoCube?
· In the number of records it holds.
173. What is the difference between InfoCube and ODS?
· An InfoCube is structured as per the Extended Star Schema, with the fact table surrounded by dimension tables that connect to SIDs, and its data can be aggregated. An ODS object is a flat structure that does not use the star schema concept; it holds detailed data in transparent tables.
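The structural difference can be sketched with a generic star-schema analogy in SQLite (table and field names are hypothetical; this is not BW's actual table layout, only the idea of a fact table joined to dimensions versus one flat detail table):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Star-schema sketch: the fact table holds key figures plus surrogate
# keys into dimension tables (a rough analogy to an InfoCube).
cur.execute("CREATE TABLE dim_customer (cust_key INTEGER PRIMARY KEY, region TEXT)")
cur.execute("CREATE TABLE fact_sales (cust_key INTEGER, revenue REAL)")
cur.executemany("INSERT INTO dim_customer VALUES (?, ?)",
                [(1, "EMEA"), (2, "APAC")])
cur.executemany("INSERT INTO fact_sales VALUES (?, ?)",
                [(1, 100.0), (1, 50.0), (2, 75.0)])

# Reporting aggregates the fact table over dimension attributes
rows = cur.execute("""
    SELECT d.region, SUM(f.revenue)
    FROM fact_sales f JOIN dim_customer d ON f.cust_key = d.cust_key
    GROUP BY d.region ORDER BY d.region""").fetchall()
print(rows)  # [('APAC', 75.0), ('EMEA', 150.0)]

# An ODS-style flat structure simply keeps every detail record as-is,
# one transparent table with no dimension join.
cur.execute("CREATE TABLE ods_sales (doc_no TEXT, region TEXT, revenue REAL)")
```

The join-plus-aggregation pattern is what makes the star schema fast for summarized reporting, while the flat table keeps full document-level detail.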
174. What is the difference between display attributes and navigational attributes?
· A display attribute is used only for display purposes in the report, whereas a navigational attribute is used for drilling down in the report. The advantage is that we don't need to maintain the navigational attribute in the InfoCube as a separate characteristic in order to drill down on it.
175. Data was uploaded twice into the InfoCube. How do you correct it?
· You can delete the duplicate request by its Request ID.
176. Can you add a new field at the ODS level?
· Yes.
177. Can many DataSources have one InfoSource?
· Yes. For example, for loading text and hierarchies we use different DataSources but the same InfoSource.
178. Apart from R/3, which legacy databases did you use for extraction?
· Access, Informatica
179. There were problems with delta loads. The DataSources and transfer rules were re-activated and transported from the DEV systems to the Production systems. When the jobs are scheduled, there is an error that the delta update to the InfoCube has been invalidated because a previous delta load has been deleted from the InfoCube. All data had already been deleted from the ODS objects and the InfoCube, and the initialization-of-delta job was then run to restart the process. The load to the ODS objects completes successfully, but the subsequent load to the InfoCube from the two ODS objects fails with this error.
· Prior to deleting all the data, did you run the last deltas from the delta collective runs into the InfoCube? It sounds as if you may have left the delta data in the queue and re-initialized the delta process. Thus, when you go to load the delta data, it won't load because it is looking for the predecessor delta that you deleted.
180. I had initialized an LIS structure for billing and when I ran the delta request it is giving me an error stating that this is a duplicate document and my user wants the report urgently.
· What I did was,
o Deleted setup data (LBWG in R3)
o Deleted delta queue (RSA7 in R3)
o Generated new queue (for 2LIS_13_VDITM it is OLI9BW transaction)
181. I have a requirement to change an attribute to navigational in an existing InfoCube. I am sure I have to check the Navigational option on the Attributes tab of the respective characteristic, and that I also have to check the option in the InfoCube. My concern is that this InfoCube already has millions of records in production. Is there any way to realign it without reloading the InfoCube? How about realigning the master data after making those changes? Is it mandatory to reload the InfoCube, or is a workaround available?
· All you need to do after selecting and activating the object and the InfoCube is run a hierarchy/attribute change run. That should make the navigational attribute visible for reporting. In my first project I created an FI InfoCube with company information, customized the FI-AR InfoCube, and created a MultiProvider for the users to drill down on customer information and ledger accounts. I also customized the sales overview for transactional data.
182. When Extracting Sales Data using V3 Collective Run using LBWE job control, no data is being extracted (Nothing is being shown in RSA3). When filling up Setup Table for OLD Documents, we are able to see the Extracted Data records in RSA3. What may be wrong? And what should be the actual procedure to see data consistently in RSA3 so that BW can pull records from R/3.
· In RSA7 on the source system do you see your DataSources under queue maintenance and are they green? Also, did you run OLI7BW to setup your data in the statistical queue for initialization? And finally did you run your init from BW to initialize the delta process and get the initial load into BW? If you have done all these things you should be collecting deltas in the delta queue under RSA7.
183. Where is the PSA data stored?
· In the PSA table.
184. What is data size?
· The volume of data a data target holds (in number of records).
185. Different types of InfoCubes.
· BasicCube
· Virtual Cube
o RemoteCube
o SAP RemoteCube
o Virtual InfoCube with services
186. What is an InfoSet?
· An InfoSet is an intersection (join) of multiple InfoProviders. It can be made up of ODS objects and InfoObjects only.
187. If there are two DataSources, how many transfer structures are there?
· Two in R/3 and two in BW.
188. What are indexes?
· Indexes are database indexes, which help in retrieving data fast.
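The effect of an index on retrieval can be seen in miniature with SQLite's query planner (table and index names here are hypothetical; the exact plan text varies by SQLite version):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (doc_no INTEGER, customer TEXT)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [(i, "C%d" % (i % 100)) for i in range(10_000)])

# Without an index, the lookup scans every row of the table
plan_before = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM sales WHERE customer = 'C7'").fetchone()[-1]
print(plan_before)   # e.g. "SCAN sales"

# With an index, the database jumps straight to the matching rows
conn.execute("CREATE INDEX idx_customer ON sales(customer)")
plan_after = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM sales WHERE customer = 'C7'").fetchone()[-1]
print(plan_after)    # e.g. "SEARCH sales USING INDEX idx_customer (customer=?)"
```

The same trade-off motivates the tuning advice later in this document: indexes speed up reads but slow down inserts, which is why heavy loads often drop and rebuild them.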
189. Is it necessary to initialize each time the delta update is used?
· No
190. After the data extraction what is the image position?
· After image
191. What are Authorizations?
· Authorizations control which data and functions a user may access; they are maintained as profiles using the Profile Generator.
192. Can a characteristic InfoObject be an InfoProvider?
· Yes, if it carries master data (attributes or texts).
193. What is data Integrity and how can we achieve this?
· Data integrity is about eliminating duplicate entries in the database and achieving normalization.
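One common way to achieve this is to let the database itself enforce uniqueness, so duplicates are rejected rather than stored. A minimal sketch in SQLite (the table and key names are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A PRIMARY KEY (or UNIQUE) constraint makes the database enforce
# integrity itself, rejecting duplicate entries instead of storing them.
conn.execute("CREATE TABLE customers (cust_id TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers VALUES ('C1', 'Acme')")

try:
    conn.execute("INSERT INTO customers VALUES ('C1', 'Acme again')")
except sqlite3.IntegrityError:
    print("duplicate rejected")

count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(count)  # 1
```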
194. What is index maintenance and what is its purpose?
· Indexing is a process by which the addresses of data records are stored; it enables faster access to the data. Index maintenance means keeping these indexes up to date, for example by dropping them before large loads and rebuilding them afterwards.
195. When and why use InfoCube compression?
· When the data in the InfoCube is not going to change any more and it occupies a lot of space, we compress the InfoCube. Compressed data cannot be altered or deleted except through selective deletion, so compression must be an informed decision. It can be done through a process chain or manually.
196. How can Business Content be enhanced? And why do we need to enhance the Business Content?
· We can enhance Business Content by adding fields to the extract structures delivered with SAP Business Content. We may need to enhance it to provide fields that are not already in the Business Content, as per the customer's needs. E.g., you have a customer InfoCube (in BC) but your company uses an attribute for, say, apartment number; instead of constructing a whole new InfoCube, you can add that field to the existing BC InfoCube.
197. What is Tuning and why do we do it?
· Tuning is done to increase efficiency. It is done to lower time for:
o Loading data into Data Target
o Accessing a query
o Drilling down in a query, etc.
198. What is a MultiProvider and how do we use MultiProvider?
· MultiProvider can combine various InfoProviders for reporting purposes. We can extract data from an ODS, InfoCube, InfoSet, InfoObject, etc in any combination.
199. What are scheduled and monitored data loads?
· Scheduling a data load means setting the load to run at a particular date and time. It is done from the Scheduler tab when creating an InfoPackage. Data loads are monitored using transaction RSMON.
PERFORMANCE TUNING
200. What data load tuning can one do?
· Load balance on different servers
· Indexes on source tables
· Use fixed length files if data is loaded from flat files and put the file on the application server
· Use content (SAP delivered) extractor as much as possible
· Use “PSA and Data target in parallel” option in the InfoPackage
· Start several InfoPackages parallel with different selection options
· Buffer the SID number ranges when a lot of data is loaded at once
· Load Master Data before loading Transaction Data
· Watch out for ABAP code in transfer and update rules – it might slow performance
201. What are the general tuning guidelines?
· Archive and delete old data
· Use line item dimensions for large dimensions
· Use MultiProviders to parallel query on the basic InfoCubes
· Use the BW statistics InfoCube to monitor performance
· Reporting authorizations slow the performance
· Web reporting is faster than BEx reporting
· Use the aggregate hierarchies to minimize the roll-up time
· Use parallel upload and activation of ODS objects
· Disable the BEx reporting flag if ODS is not used for reporting
202. Ways to improve the performance
· Define as many dimensions as possible, so each dimension table stays small.
· Create aggregates
· Check and define Line Item Dimension
