Operation of Analytics with bc2adls
Create Staging Indexes
If your LS Central and Analytics data warehouse have very large transactional tables (xx million or more transactions), it is a good idea to create indexes on the staging tables. This makes the ETL processes more efficient.
You can create the indexes by running the Create Staging Indexes pipeline.
Before you create the staging indexes, make sure that no other pipeline is running. If you have very large tables, it is a good idea to disable the daily schedule triggers while this pipeline completes its run.
Please follow these steps:
- In the Azure Data Factory, open the Author option (the pencil icon in the left navigation menu). Here you can see the Pipelines section with all the available pipelines; there should be 15 pipelines.
- Expand the Pipelines section and the 1 - Query Setup folder.
- Select the Create Staging Indexes pipeline.
- Click Add Trigger to trigger a pipeline run.
- In the pipeline run panel, enter a value for the MinimumRowCount parameter. It is set to 100,000 by default, but you can adjust it if needed.
- Click OK.
The pipeline now runs, and you can monitor it from the Monitor tab like any other pipeline. Once the pipeline has completed successfully, you can turn the daily scheduled run trigger on again.
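You can also start this pipeline from code instead of the portal. The following is a minimal sketch using the azure-identity and azure-mgmt-datafactory Python packages; the subscription ID, resource group, and factory names are placeholders you need to replace, while the pipeline and parameter names are the ones used in the steps above.

```python
# Minimal sketch: start the Create Staging Indexes pipeline from Python.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

RG, DF = "<resource-group>", "<analytics-data-factory>"  # placeholders
client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Pass the MinimumRowCount parameter (100,000 by default) with the run.
run = client.pipelines.create_run(
    RG, DF, "Create Staging Indexes",
    parameters={"MinimumRowCount": 100000},
)
print(f"Started pipeline run {run.run_id}")
```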
Because you are creating indexes, it is important to maintain them, so you should also create a schedule trigger for the Database Maintenance pipeline. It is enough to run this pipeline every xx weeks, so you can, for example, schedule it on weekends or at a time when neither the Scheduled run nor the Report refresh is running.
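If you prefer to create this schedule trigger programmatically rather than in the Data Factory UI, a sketch along these lines should work with the same Python SDK. The trigger name and the Saturday 02:00 UTC slot are illustrative assumptions; pick a window when neither the Scheduled run nor the Report refresh is active.

```python
# Sketch: create and start a weekly schedule trigger for the Database
# Maintenance pipeline (trigger name and schedule below are assumptions).
from datetime import datetime, timezone
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    PipelineReference, RecurrenceSchedule, ScheduleTrigger,
    ScheduleTriggerRecurrence, TriggerPipelineReference, TriggerResource,
)

RG, DF = "<resource-group>", "<analytics-data-factory>"  # placeholders
client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

trigger = ScheduleTrigger(
    recurrence=ScheduleTriggerRecurrence(
        frequency="Week",
        interval=1,  # every week; widen the interval if that is too often
        start_time=datetime(2024, 1, 6, 2, 0, tzinfo=timezone.utc),
        time_zone="UTC",
        schedule=RecurrenceSchedule(week_days=["Saturday"]),
    ),
    pipelines=[TriggerPipelineReference(
        pipeline_reference=PipelineReference(reference_name="Database Maintenance"),
    )],
)

client.triggers.create_or_update(
    RG, DF, "Weekly Database Maintenance", TriggerResource(properties=trigger))
client.triggers.begin_start(RG, DF, "Weekly Database Maintenance").result()
```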
Resetting the data warehouse
With the bc2adls data version, data from LS Central SaaS is no longer kept in large pre-staging tables; the pre-staging tables now hold only the deltas for each incremental load. Therefore, if you need to completely refresh the data in the data warehouse, follow the steps below:
Turn off trigger
Before you reset the data, it is important to disable the Daily scheduled run trigger to make sure it does not run while you are running the Factory reset pipeline.
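The trigger can also be disabled from code. A minimal sketch with the same Python SDK is shown below; the trigger name is taken from this article and should be verified against your factory.

```python
# Sketch: disable the "Daily scheduled run" trigger before a factory reset.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

RG, DF = "<resource-group>", "<analytics-data-factory>"  # placeholders
client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Stop the trigger so no scheduled run starts during the reset.
client.triggers.begin_stop(RG, DF, "Daily scheduled run").result()
```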
Reset all tables being exported with bc2adls
- In the Export to Azure data lake Storage page, navigate to the Tables section.
- Select all lines in the table list (select one line, Shift-click another, and then click the header selection to select all lines).
- Click Reset from the section menu.
- A message window appears that allows you to reset the tables for only the current company or all companies.
- If you are only exporting from one company, you can select Current Company; if you are exporting from several companies, you can reset them all at once by selecting All Companies.
- Click OK.
- A message is displayed with the count of tables that were reset.
- Click OK to close.
The tables have now been reset, and the next time the export runs, all records from the reset tables will be exported. This means that the export will probably take longer than the usual incremental export.
If you are re-exporting because you are adding tables or fields, you need to export the schema again before exporting the data.
- In the Export to Azure data lake Storage page, click the v (downward arrow) next to the Export option in the top menu.
- Select Schema export.
- A message pops up about a request being made to the external service. Allow Always is already selected, so you can click OK.
The export goes through all the tables, and a Working on it... message box shows which table's schema is currently being exported.
Once you are ready to export the data, you can either wait for the next scheduled export or select Export from the menu and wait for all tables to finish exporting before moving on to running the Factory reset pipeline.
Trigger Factory reset pipeline
Next, you manually trigger a run of the Factory reset pipeline; a scripted alternative is sketched after the steps below.
- In the Azure portal, navigate to the Analytics data factory.
- Select the Author tab in the left menu and navigate to the 4 - ETL Flows folder.
- Select the Factory reset pipeline and click Add Trigger > Trigger now from the action menu.
- Monitor the progress of the pipeline from the Monitor tab; once it has completed successfully, you are done.
- Remember to enable the Daily scheduled run trigger again.
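If you script this step instead, the same SDK can start the Factory reset pipeline, poll until it reaches a terminal state, and then re-enable the daily trigger. The sketch below is one way to do this; the pipeline and trigger names follow the naming used in this article and should be checked against your factory.

```python
# Sketch: start the Factory reset pipeline, poll until it finishes, then
# re-enable the daily trigger (names follow this article; verify in your factory).
import time
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

RG, DF = "<resource-group>", "<analytics-data-factory>"  # placeholders
client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

run = client.pipelines.create_run(RG, DF, "Factory reset")
status = "InProgress"
while status not in ("Succeeded", "Failed", "Cancelled"):
    time.sleep(60)  # poll once a minute
    status = client.pipeline_runs.get(RG, DF, run.run_id).status

if status != "Succeeded":
    raise RuntimeError(f"Factory reset finished with status: {status}")

# Pipeline succeeded: turn the daily schedule back on.
client.triggers.begin_start(RG, DF, "Daily scheduled run").result()
```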
You have now completely refreshed the data in the data warehouse, and any new tables or columns you added are now included in the staging tables.
Note: Please refer to the extension guidelines to add new columns and tables to dimension and fact tables.