Joining tables

You can join tables in the same worksheet. The Join function works in the same way as the SQL join clause.

NOTE: You can join two tables or join a table to itself (perform a self join). You cannot use one operation to join more than two tables.

Joining by one column or multiple columns

By default, the Join function uses one key from each table for joining. You can select one column from each table by which the tables are joined.

You can, however, perform a multi-clause join, which is a Join function that uses pairs of join keys.

Follow these steps to join tables:

  1. To join by one column, right click the header of a column in one of the tables that you want to join. A drop-down menu is displayed.

    To join by multiple columns, select the columns in one of the tables that you want to join. Then right click the header of one of the selected columns to display the column menu.

  2. Select Join… in the menu. The JOIN panel appears.

  3. Select the type of Join function to perform. By default, Inner join is selected as the type of Join operation. The following types of Join operations are supported:

    • Inner join
    • Left outer join
    • Right outer join
    • Full outer join
  4. In the Right Table field, click the Down arrow to display a list of tables. Then select the table to join to the left table.

  5. To select the columns to join, you can either click SMART SUGGEST to display column names suggested by Xcalar Design or fill out the Right Columns fields. (The Left Columns fields are filled out for you based on the columns selected before you display the JOIN  panel.)

    (Optional) For a multi-clause join, click ADD ANOTHER CLAUSE to add another pair of columns to join.

    Verify the table names and column names in the Command Preview section.

    The columns in the worksheet are numbered based on the order of the clauses in the JOIN panel. The number, which is enclosed in circle, is displayed in the upper right corner of the column header as illustrated in the following partial screenshot:

  6. Click NEXT.

    For columns to be joined, they must be of the same data type. If the columns in a clause have mismatched data types, an error message is displayed, and a drop-down menu is provided for you to change the data type for one or both columns. If you select Default in the menu, the data type remains unchanged. After you finish changing the data type, click NEXT.

    NOTE: The column for which you change the data type in the JOIN panel retains the original data type in the resultant table. The data type change is only for Xcalar to execute the Join operation.
  7. In the Column Selector section, select the columns from the left and right tables that are included in the resultant table. The following screenshot illustrates how to select columns:

  8. (Optional) Select Keep original tables if you want both tables to remain on the worksheet. If you want to keep only the resultant table, leave this option unchecked.
  9. Enter the new table name.
  10. (Optional) Check the number of rows in the Estimated join size section. Xcalar Design provides the minimum, maximum, and median number of rows in the resultant table. The time required to complete the Join operation is proportional to the number of rows created. Based on the estimates, you can decide whether you should perform the operation at this time.
  11. Click JOIN TABLES. A new table is created as a result of joining the tables with the keys you selected.

    NOTE: If there are duplicate column names in the resultant table, you are prompted to rename the columns before the Join operation takes place.

Renaming columns to avoid column name duplication

Because columns in a table created by a Join operation come from two tables, it is possible for these columns to have the same name and hence cause confusion. To eliminate duplicate column names, the JOIN panel allows you to rename columns by filling out the New Name field.

IMPORTANT: Even if you do not rename a column, you must copy the original name to the New Name field. Do not leave the New Name field blank.

Renaming prefixed columns

Fields that exist in different data sources are distinguished from one another by column prefixes. If the tables being joined have the same prefix (for example, in the case of a self join), and if prefixed columns are in the resultant table, change the prefix for one table or both tables to eliminate the naming conflict.

For example, you can change the prefix for the left table from airlines1 to Left_airlines1 and the prefix for the right table from airlines1 to Right_airlines1. The resultant table will have columns named Left_airlines1::Carrier and Right_airlines::Carrier, which no longer conflict with each other.

The following screenshot illustrates how to change a prefix. In this example, the prefix for the left table remains unchanged, and the new prefix for the right table is manually entered.

Alternatively, you can click the Rename icon for the right table to generate a new prefix to resolve the conflict. For example, if you click the Rename icon for the right table to change the airlines4 prefix, a new prefix (airlines41) will be automatically entered in the New Name field.

TIP: If a lot of prefixes need renaming, follow the instructions in Renaming in bulk to rename them quickly.

Renaming derived columns

Derived fields do not have a prefix. Therefore, you must change the column name to avoid name duplication. In the following example, the derived fields from the left table remain unchanged, and the derived fields from the right table are manually entered.

Alternatively, you can click the Rename icon for the right table to generate a default new name to resolve the conflict. For example, if you click the Rename icon for the right table to change DayOfWeek_integer, a new name (DayOfWeek_integer1) is automatically entered in the New Name field.

TIP: If a lot of derived fields need renaming, follow the instructions in Renaming in bulk to rename them quickly.

Renaming in bulk

Xcalar Design provides a convenient way for you to specify multiple new names. The following steps are the same for both prefixes and derived field names:

  1. Click the Rename in bulk icon as shown in the following screenshot to display a pop-up menu.

  2. In the menu, you can click Copy All to copy existing names to the New Name field.

    Alternatively, type a string under Copy & Append. The string is appended to each existing name before the name is copied to the New Name field. In this example, the string -left is appended.

    You can enter another string to overwrite what is currently in the New Name field. For example, if you want another string instead of -left to be appended, simply type the new string in the pop-up menu. The change is reflected in the New Name column immediately.

Go to top