Amazon Redshift is a data warehouse product that forms part of the larger cloud-computing platform Amazon Web Services. It is built on technology from the massively parallel processing (MPP) data warehouse company ParAccel and is designed to handle large-scale data sets and database migrations. Redshift uses parallel processing and compression to decrease command execution time, which allows it to perform operations on billions of rows at once.
Connecting to the Database
Mammoth allows you to connect to your database and bring its data into Mammoth.
Once the connection is established, you will be presented with a list of tables and views in that database.
- Select the desired table to get a preview.
- Write your own SQL query or run a test query and preview the result.
- Click on Next
After you have selected the table you want to work on, you can configure it as follows:
- Rename it in the data pull scheduling window.
- Save it in a desired location in the Data Library using the Adding file to option.
Scheduling your Data Pulls
You can start retrieving the data immediately or at a specific time of your choice. You can also schedule the data pull to get the latest data from your database at a regular interval: every few minutes, daily, weekly, or monthly.
On every data pull from your database, you can choose to either replace the older data or combine the new data with the older data.
If you choose the Combine with older data option, you are asked to choose a unique sequence column. On each refresh, Mammoth uses this column to pick up only the rows whose value in this column is greater than the highest value seen in the previous data pull.
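The Combine with older data behaviour can be pictured with a short sketch. This is not Mammoth's implementation; it is a minimal illustration of the general technique, assuming a hypothetical `orders` table whose `id` column acts as the unique sequence column, and using Python's built-in `sqlite3` module as a stand-in for a real Redshift connection. The function name `incremental_pull` is likewise made up for this example.

```python
import sqlite3


def incremental_pull(conn, table, seq_col, last_seen=None):
    """Fetch rows whose sequence value is greater than last_seen.

    Hypothetical helper: on the first pull (last_seen is None) every row
    is fetched; on later pulls only rows with a greater sequence value.
    """
    # Table and column names cannot be bound as parameters, so they are
    # interpolated here; only pass trusted identifiers.
    base = f'SELECT * FROM "{table}"'
    order = f' ORDER BY "{seq_col}"'
    if last_seen is None:
        cur = conn.execute(base + order)
    else:
        cur = conn.execute(base + f' WHERE "{seq_col}" > ?' + order, (last_seen,))

    columns = [d[0] for d in cur.description]
    rows = cur.fetchall()
    seq_index = columns.index(seq_col)
    # Rows are ordered by the sequence column, so the last row holds the
    # highest value; keep the old marker if nothing new arrived.
    new_last = rows[-1][seq_index] if rows else last_seen
    return rows, new_last


def demo():
    # In-memory SQLite database standing in for the source database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, item TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, "a"), (2, "b")])

    # First pull: everything is fetched.
    dataset, last_seen = incremental_pull(conn, "orders", "id")

    # A new row appears in the source before the next scheduled pull.
    conn.execute("INSERT INTO orders VALUES (?, ?)", (3, "c"))

    # Second pull: only rows with id > last_seen are fetched and appended.
    new_rows, last_seen = incremental_pull(conn, "orders", "id", last_seen)
    dataset.extend(new_rows)
    conn.close()
    return dataset, last_seen
```

Under this scheme, a row that is modified in the source without its sequence value increasing would not be picked up on refresh, so a column that only ever grows (such as an auto-incrementing ID or an insertion timestamp) is the natural choice.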
- Make sure that Mammoth’s public IP address is added to your whitelist.
- Mammoth’s public IP is displayed on the create connection window.