DataStar: Data Integration

DataStar users will typically want to use data from a variety of sources in their projects. This data can live in different locations and systems, and there are multiple methods available to get it into the DataStar application. In this documentation we describe the main categories of data sources users may want to use and the ways of making these available for use in DataStar.

If you would first like to learn more about DataStar before diving into data integration specifics, please see the Navigating DataStar articles on the Optilogic Help Center.

Overview

The following diagram shows different data sources and the data transfer pathways to make them available for use in DataStar:

  1. Local Data – data that is available locally on a DataStar user’s computer, typically Excel workbooks, CSV files, and SQL databases.
  2. External Data – data that sits outside of a user’s local computer. This usually falls into one of the following categories:
    1. Cloud Storage: examples are Google Sheets, OneDrive, and Amazon S3
    2. Cloud Data Warehouse: examples are Google BigQuery, Amazon Redshift, Snowflake, and Microsoft Azure
    3. ERP and Planning Systems: examples are SAP, Kinaxis, Oracle, and Infor
  3. Optilogic Platform – data can be stored in the user’s account on the Optilogic platform. The files available in a user’s account can be viewed using the Explorer application. To learn more about the Explorer, please see this Getting Started Help Center article.
  4. DataStar – the DataStar application is also hosted on the Optilogic platform, and this is where we want to be able to use the data from the different sources.
  5. For local data, the steps to go from the files sitting on the user’s computer to using them in DataStar are as follows (these are explained in more detail in the “Local Data” section below):
    1. Upload the files to the Optilogic platform. This can be done one file at a time or multiple files simultaneously, and either manually or programmatically.
      • Please note that for local PostgreSQL databases, users need to programmatically connect to the database, extract the data, and load it onto the Optilogic platform.
    2. Set up a Data Connection from within DataStar to the files now present in the Explorer so that DataStar can access the files.
    3. Use Import tasks using this created Data Connection as the source to pull data into the DataStar project(s) that need to use this data.
  6. For external data, there are 2 main methods that can be used:
    1. Utilize ETL tools, automation platforms, or custom scripts to push data onto the Optilogic platform. Here the user starts the process of data extraction and upload to the Optilogic platform from within the tool / platform / script, so it is a “push” action from the customer side. See also the External Data - "Push" from Customer Side section further below for more details.
      • Some of these tools, such as Talend and Azure Data Factory, can connect to a variety of data sources and extract data in CSV and/or Excel format. These extracted files can then be pushed onto the Optilogic platform by the middleware. Once available there, the steps from bullets 5b and 5c can be taken to make the data available for use in DataStar.
      • Data extracted by the middleware tool can also be landed directly into the Project Sandbox of a DataStar project or in a PostgreSQL database for staging.
    2. Scripts and utilities from the Optilogic platform side can be used to pull data from various sources onto the platform and typically copy the data straight into the Project Sandbox of a DataStar project. These scripts can exist and be run 1) locally on a user’s computer, or 2) on the Optilogic platform in the user’s account (outside of DataStar), or 3) from inside DataStar leveraging Run Python and Run Utility tasks. Several template scripts and utilities are available currently in the Resource Library application on the Optilogic platform; for more details, see the External Data - "Pull" from Optilogic Side section further below.

Local Data

We will now dive a bit deeper into making local data available for use in DataStar, building upon what was covered under bullets 5a-5c in the Overview diagram above. First, we will familiarize ourselves with the layout of the Optilogic platform:

  1. Access your account on the Optilogic platform by logging into it from optilogic.app or cosmicfrog.com.
  2. Once logged in, the breadcrumb trail in the toolbar at the top always indicates where in the platform you are currently working. The format is: Organization (always Optilogic) – Team (only shown if working in a Team workspace, not when working in your own account) – Application (Home here) – Project / Model / File name (only shown if anything is open in the currently open application).
  3. When first logging in, you will land on the Home page of the platform, where you can find getting-started videos, training courses, links to useful resources, and the latest Optilogic news.
  4. On the left-hand side the list of applications available on the platform is shown. Most users will be mainly using Cosmic Frog and/or DataStar for their supply chain modelling and data workflow purposes, using the other applications to support the use of these 2 main applications.
  5. The other applications can also be accessed from the list. Note that the order of the applications may be different in your account. You can drag and drop the application icons to re-arrange the order. In case not all applications are shown in the list, you can click on the icon with 3 horizontal dots to show all applications.
  6. The Explorer sits across all applications on the platform. It can be opened and closed by clicking on the chevron icon at the left top. When open:
    1. The folders present in the user’s account are shown. These can be expanded by clicking on the chevron icons to the left of the folder names.
    2. Quick search, filter, share, zoom to file, collapse all, and refresh options are available to help the user find their file(s) of interest quickly.

Next, we will cover in detail the 3 steps to go from data sitting locally on a user’s computer to being able to use it in DataStar, through the next set of screenshots. At a high level, the steps are:

  1. Upload files to the Optilogic platform.
  2. Create Data Connections to these files from within DataStar.
  3. Create Import tasks in the DataStar project(s) using the Data Connections as the data source.

Upload Data to Optilogic Platform

To get local data onto the Optilogic platform, we can use the file / folder upload option:

[Screenshot: the Explorer context menu with the Upload Files option]
  1. Open the Explorer if not yet open by clicking on the chevron icon at the top left.
  2. Right-click on the folder you want to upload files to.
  3. Select Upload Files from the context menu. The following form comes up:
[Screenshot: the File Upload form]
  1. The folder that was right clicked on will be the one the files will be uploaded to.
  2. To upload one or a few files from a folder, choose the Add Files option.
  3. To upload all or almost all files from a folder, choose the Add Folder option.

Select either the file(s) or folder you want to upload by browsing to it/them. After clicking on Open, the File Upload form will be shown again:

[Screenshot: the File Upload form listing the selected files]
  1. You can go back and re-select what you want to upload by using the Add Files or Add Folder buttons.
  2. A summary and list of the files to be uploaded is displayed.
  3. Two of the files have a warning and are highlighted in yellow because files with the same names already exist in the folder we are about to upload into. One option is to rename a file so there is no naming conflict; click on the Rename icon to do so.
  4. Instead of renaming a file with a conflicting name, it can also be removed from the list so it will not be uploaded.
  5. Another option to resolve file name conflicts is to allow overwriting of already existing files with files of the same name that are going to be uploaded. Switch the option to on if overwriting is allowed. If not allowed, then any name conflicts will need to be resolved before the upload can start.
  6. Once files have been selected and any naming conflicts have been resolved (when overwriting is not allowed), the Upload button can be used to start the upload of the file(s).
  7. If at any point the user decides they do not want to go ahead with uploading any files, they can click on the Cancel button.

Note that files in the upload list that will not cause name conflicts can also be renamed or removed from the list if so desired. This can, for example, be convenient when you want to upload most files in a folder except for a select few. In that case, use the Add Folder option and remove the few files that should not be uploaded from the list, rather than using Add Files and manually selecting almost all files in the folder.

Once the files are uploaded, you will be able to see them in the Explorer by expanding the folder they were uploaded to or searching for (part of) their name using the Search box.

Create Data Connections

The second step is to make these files visible to DataStar by setting up Data Connections to them:

[Screenshot: the Create Data Connection form in DataStar]
  1. Within DataStar (open it by clicking on the DataStar application icon in the list of applications on the left), click on the Create Data Connection button.
  2. Give the connection a name and optionally a description, then choose the connection type.  
  3. The connection type options are currently: CSV files, Excel Files (beta version), Cosmic Frog models, or PostgreSQL databases.
    1. Note that currently, connections can only be created to PostgreSQL databases hosted on the Optilogic platform.

After setting up a Data Connection to a Cosmic Frog model and to a CSV file, we can see the source files in the Explorer, and the Data Connections pointing to these in DataStar side-by-side:

[Screenshot: DataStar’s Data Connections tab with the Explorer open alongside it]
  1. DataStar is the active application as indicated by its icon having a darker background.
  2. DataStar will be shown in the center part of the platform. On the DataStar start page, the Data Connections tab is open.
  3. The Explorer is also open and is shown on the left-hand side.
  4. The source for the connection named “Empty CF model for DataStar Export” (connection type = Cosmic Frog) is the model named “Empty Model for DataStar.frog”, shown in the /My Files/DataStar folder in the Explorer. DataStar can now access this model.
  5. Similarly, the source for the connection named “Historical Shipments” (connection type = CSV) is the file named Shipments.csv shown in the /My Files/DataStar folder in the Explorer. DataStar can now access this file too.

Import Data into DataStar Projects

To start using the data in DataStar, we need to take the third step of importing the data from the Data Connections into a project. Typically, the data will be imported into the Project Sandbox, but it can also be imported into another Postgres database, including a Cosmic Frog model. Importing data is done using Import tasks; the Configuration tab of one is shown in this next screenshot:

[Screenshot: the Configuration tab of an Import task]
  1. The type of task we are configuring is Import.
  2. First, the Source Data Connection is configured; this is the data source from which data will be pulled into DataStar.
  3. We can select the Source Data Connection from the drop-down list showing all Data Connections that have been set up (Excel, CSV, Cosmic Frog model, or other Postgres database). The screenshot shows the Cosmic Frog model connection and the CSV file connection whose source files were shown in the previous screenshot.
  4. Next, the Target Data Connection for the import is specified, here a new table named “Shipments_2025” in the Project Sandbox.

The 3 steps described above are summarized in the following sequence of screenshots:

[Screenshots: the upload, Data Connection, and Import task steps in sequence]

Local Data Refresh

For a data workflow that is used repeatedly and needs to be re-run using the latest data regularly, users do not need to go through all 3 steps above of uploading data, creating/re-configuring data connections, and creating/re-configuring Import tasks to refresh local data. If the new files to be used have the same name and same data structure as the current ones, replacing the files on the Optilogic platform with the newer ones will suffice (so only step 1 is needed); the data connections and Import tasks do not need to be updated or re-configured. Users can do this manually or programmatically:

  1. Manual local data refresh: as described above, by using the upload files / folder option from within the Explorer. Note that users will need to take care of archiving any previously used files themselves in case this is required; this can be done either on the Optilogic platform or locally. When archiving on the platform, a quick manual approach is to rename the folder containing the outdated files, create a new folder with the original folder name, and upload the required files into this new folder (note that this only works well when replacing all files with newer ones).
  2. Programmatic local data refresh: users can use scripts to bulk upload their local files into the desired folder in their Optilogic account, or to load data from local files straight into a Postgres database (which could be a Project Sandbox). A Python utility that can be leveraged as-is or modified to users’ needs is available from the Resource Library; a minimal sketch of the direct-to-Postgres approach is also shown below this list.
    1. A Python utility which imports local CSV files straight into a Postgres database on the Optilogic platform is available here on the Resource Library. The utility and its detailed documentation can be downloaded / copied from the Resource Library. This utility has the option of appending uploads to existing tables instead of creating a new table for each CSV file, and it is designed for high-speed ingestion while maintaining data integrity and reproducibility.
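
To make the direct-to-Postgres variant concrete, here is a minimal sketch that bulk-loads all CSV files from a local folder into tables in a project sandbox. It assumes the sandbox connection string has already been retrieved (see the direct database connection section further below); the folder path, connection string, and table naming are illustrative placeholders, and the Resource Library utility remains the recommended, fully featured starting point.

```python
# Minimal sketch: bulk-load local CSV files into a project sandbox (or any
# Postgres database on the Optilogic platform). The connection string and
# local folder below are placeholders to replace with your own values.
from pathlib import Path

import pandas as pd
from sqlalchemy import create_engine

CONNECTION_STRING = "postgresql://user:password@host:port/database"  # placeholder
LOCAL_FOLDER = Path("C:/data/monthly_refresh")  # illustrative local folder

engine = create_engine(CONNECTION_STRING)

for csv_file in LOCAL_FOLDER.glob("*.csv"):
    df = pd.read_csv(csv_file)
    # One table per CSV file; replacing the table on each refresh means the
    # downstream Import tasks keep pointing at the same table name.
    df.to_sql(csv_file.stem.lower(), engine, if_exists="replace", index=False)
    print(f"Loaded {len(df)} rows into table '{csv_file.stem.lower()}'")
```

Using if_exists="append" instead of "replace" mirrors the append option of the Resource Library utility described above.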

External Data - “Push” from Customer Side

This section describes how to bring external data into DataStar using supported integration patterns where the data transfer is started from an external system, i.e. the data is “pushed” onto the Optilogic platform.

Integrating Data Using Optilogic APIs

External systems such as ETL tools, automation platforms, or custom scripts can load data into DataStar through the Optilogic Pioneer API (please see the Optilogic REST API documentation for details). This approach is ideal when you want to programmatically upload files, refresh datasets, or orchestrate transformations without connecting directly to the underlying database.

Key points:

  1. The API uses standard Optilogic authentication
  2. External tools simply perform authenticated API calls
  3. The same endpoints used by DataStar’s native Python utilities are available for external integrations

Please note that Optilogic has developed a Python library to facilitate scripting for DataStar. If your external system is Python-based, you can leverage this library as a wrapper for the API. For more details on working with the library and a code example of accessing a DataStar project’s sandbox, see the “Using the DataStar Python Library” help center article.
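
For systems that are not Python-based, the pattern reduces to an authenticated HTTP call. The sketch below shows the general shape of such a call; the endpoint path and header name are illustrative assumptions only, so take the actual values from the Optilogic REST API documentation referenced above.

```python
# Minimal sketch of a push-style file upload from an external system.
# NOTE: the endpoint path and header name are hypothetical placeholders;
# consult the Optilogic REST API documentation for the real values.
import requests

API_BASE = "https://api.optilogic.app"  # assumption: confirm in the API docs
APP_KEY = "YOUR_OPTILOGIC_APP_KEY"      # standard Optilogic authentication

def upload_file(remote_path: str, local_path: str) -> None:
    """Push one local file onto the Optilogic platform."""
    with open(local_path, "rb") as f:
        response = requests.post(
            f"{API_BASE}/v0/files/{remote_path}",  # hypothetical endpoint
            headers={"x-app-key": APP_KEY},        # hypothetical header name
            data=f,
        )
    response.raise_for_status()

upload_file("My Files/DataStar/Shipments.csv", "Shipments.csv")
```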

Integrating Data Using Direct Database Connections

Every DataStar project is backed by a PostgreSQL database. You can connect directly to this database using any PostgreSQL-compatible driver, including:

  1. ODBC
  2. JDBC
  3. Native PostgreSQL clients
  4. Cloud integration connectors
  5. BI tools or data pipelines that support Postgres connections

This enables you to write or update data using SQL, query the sandbox tables, or automate recurring loads. The same approach applies to both DataStar projects and Cosmic Frog models since both use PostgreSQL under the hood. Please see this help center article on how to retrieve connection strings for Cosmic Frog model and DataStar project databases; these will need to be passed into the database connection to gain access to the model / project database.
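
As a minimal sketch of this direct connection approach, assuming psycopg2 as the PostgreSQL driver, the snippet below queries a sandbox table and appends rows to it. The connection string is a placeholder for the one retrieved via the help center article above, and the table and column names are illustrative.

```python
# Minimal sketch: direct SQL access to a DataStar project sandbox via a
# standard PostgreSQL driver. Connection string, table, and columns are
# illustrative placeholders.
import psycopg2

CONNECTION_STRING = "postgresql://user:password@host:port/database"  # placeholder

conn = psycopg2.connect(CONNECTION_STRING)
with conn:  # commits on success, rolls back on error
    with conn.cursor() as cur:
        # Query an existing sandbox table (illustrative name).
        cur.execute("SELECT COUNT(*) FROM shipments_2025")
        print("Row count:", cur.fetchone()[0])
        # Append rows as part of a recurring load.
        cur.execute(
            "INSERT INTO shipments_2025 (shipment_id, quantity) VALUES (%s, %s)",
            ("SHP-0001", 100),
        )
conn.close()
```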

External Data - “Pull” from Optilogic Side

Template Scripts and Utilities

Several scripts and utilities to connect to common external data sources, including Databricks, Google BigQuery, Google Drive, and Snowflake, are available in Optilogic’s Resource Library:

[Screenshot: the Resource Library filtered to DataStar resources]
  1. Go to the Resource Library application while logged in on the Optilogic platform.
  2. Click on the DataStar button at the top right to filter the list of resources down to those specific to DataStar.
  3. Optionally, use the Tags drop-down to filter the list further down to find what you are looking for quickly. Here we have selected 2 tags: "Extensibility Tool" and "Utility".
  4. Browse the list and click on a resource of interest to select it, here we clicked on the "Google Big Query Insert Script" resource.
  5. After clicking on it, a description of the resource appears on the right-hand side.
  6. Associated files can be downloaded by using these Download buttons. Each resource contains a user guide that describes the script / utility and explains what users need to configure before they can start using it.
  7. The blue buttons at the bottom right can be used to gain access to this resource. Click on the Copy to Account button to start using the resource on the Optilogic platform; scripts can, for example, be viewed, edited, and run in the Lightning Editor application. The Download All option can be used if you want to use the available files locally.

These utilities and scripts can function as a starting point to modify into your own script for connecting to and retrieving data from a certain data source. You will need to update authentication and connection information in the scripts and configure the user settings to your needs. As an example, consider the User Input section of the “Databricks Data Import Script” (a reconstruction is sketched after the list below).

The user needs to update the following lines; others can be left at their defaults and only updated if desired/required:

  • Line 21: update “USERNAME” to the user’s own username
  • Line 24: update “DATABRICKS” to the name of the DataStar project to import the Databricks data into
  • Lines 26 and 27: update the “DATABRICKS_SERVER_HOSTNAME” and “DATABRICKS_HTTP_PATH” to the hostname and path of the user’s Databricks instance
  • Line 30: update “DATABRICKS_PAT_SECRET” to the user’s Databricks Personal Access Token (PAT) to ensure authenticated access to Databricks
  • Line 33: update the query to run to extract data from Databricks. This example “SELECT * FROM CUSTOMERS” retrieves all records with all columns from a table named CUSTOMERS
  • Line 37: update this to the name of the destination table in the DataStar project database
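
For reference, below is a hedged reconstruction of what that User Input section looks like, using the variable names referenced in the list above; the exact line numbers, variable names, and default values belong to the actual script in the Resource Library.

```python
# Illustrative reconstruction of the script's User Input section; all values
# are placeholders and the real script in the Resource Library is authoritative.
USERNAME = "jane.doe@company.com"     # Line 21: your Optilogic username
DATASTAR_PROJECT_NAME = "DATABRICKS"  # Line 24: target DataStar project
DATABRICKS_SERVER_HOSTNAME = "adb-1234567890123456.7.azuredatabricks.net"  # Line 26
DATABRICKS_HTTP_PATH = "/sql/1.0/warehouses/abc123def456"                  # Line 27
DATABRICKS_PAT_SECRET = "dapiXXXXXXXXXXXXXXXX"  # Line 30: Databricks Personal Access Token
QUERY = "SELECT * FROM CUSTOMERS"     # Line 33: extraction query to run in Databricks
TARGET_TABLE_NAME = "customers"       # Line 37: destination table in the project database
```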

Before You Start

As a final note, we highly recommend taking some extra time up-front to think through naming conventions of folders (local and on the Optilogic platform), files, and data connections in DataStar before creating workflows that are meant to be repeatable and will need regular data refreshes. Deciding on this beforehand can save a lot of time further down the road.

For any questions or feedback, please feel free to reach out to the Optilogic support team on support@optilogic.com.
