Creation date Minimum version
2017-8-30 1.2.2

Implementing EXISTS and NOT EXISTS Correlated Subqueries in Xcalar

This article describes how correlated subqueries with EXISTS and NOT EXISTS clauses are implemented using the Xcalar JOIN operator. Xcalar’s powerful relational compute engine scales linearly to handle complex queries on massive data sets.

Background

Xcalar supports the following fundamental JOIN types:

  • Inner Join: Returns records that have matching values in both tables.
  • Left Outer Join: Returns all the records from the left table and only the matching records from the right table.
  • Right Outer Join: Returns all the records from the right table and only the matching records from the left table.
  • Full Outer Join: Returns all the records from both tables, whether the tables match or do not match.
NOTE: FNF (Field Not Found) values are inserted for all outer joins that contain non-matching columns.

Implementing an EXISTS and NOT EXISTS subquery with JOINs

In a SQL database query, a correlated subquery is a subquery that uses values from the outer query. The subquery is evaluated once for each row processed by the outer query.

EXISTS clause in a Correlated Subquery

In Xcalar Design the SQL subquery EXISTS condition is met with a combination of group by and inner join statements.

Consider a sample SQL query containing an EXISTS operation for tables AIRLINES and CARRIERS. The subquery returns TRUE if at least one row is returned.

Example 1

SELECT
Carrier, FlightNum, TailNum, MonthdayYear, DayOfWeek, DepTime, ArrTime
FROM
AIRLINES
WHERE EXISTS (
SELECT *
FROM CARRIERS
WHERE carriers.Carrier = airlines.Carrier
and Description like “%Airline%”);

In Example 1 the EXISTS subquery tests each row in the AIRLINES table for the existence of a match in the CARRIER table.

Implementing Example 1 with Xcalar Design

It is assumed that you have an Xcalar Design worksheet containing airlines and carriers tables.

  1. Log in to Xcalar Design and open the worksheet that contains the airlines and carriers tables.
  2. Figure 1: Xcalar Design worksheet containing the airlines and carriers tables

  3. Right-click on the carriers subquery table heading, select Filter and in the FILTER panel create a like filter with an *Airline* matching pattern.
  4. Figure 2: Carriers table Filter operation

  5. Right-click on the carriers subquery table, select Group By and create a GROUP BY statement on the carrier column of the subquery, which makes the join unique to avoid cartesian products. Enter a new column name and a new table name in the appropriate fields. In this case, a Carrier_count column is created in the carriers_carrier_grpby table.
  6. NOTE: If the joining column is unique by design there is no need to create a group by operation.

    Figure 3: Carriers table Group By operation

  7. Right-click on the column in the new table that you want to join in your main table, select Join and then from the Join Type list, select Inner Join. In the Left and Right Table lists and the Left and Right Columns fields select or enter the tables and columns to join.
  8. Figure 4: Join operation

  9. Click NEXT and select the required columns that are to be displayed. Enter a new table name and complete the join operation by clicking JOIN TABLES.
  10. Figure 5 shows the resultant query table created from a Xcalar EXISTS operation.

    Figure 5: Query Exists table

NOT EXISTS clause in a Correlated Subquery

In Xcalar Design the SQL subquery NOT EXISTS condition is met with a combination of group by and left outer join statements and the FNF filter operation.

Consider a sample SQL query containing a NOT EXISTS operation, which finds the details of airlines who are not part of the list of carriers from the subquery.

Example 2

SELECT
Carrier, FlightNum, TailNum, MonthdayYear, DayOfWeek, DepTime, ArrTime
FROM
AIRLINES
WHERE NOT EXISTS (
SELECT *
FROM CARRIERS
WHERE carriers.carrier = airlines.Carrier
and description like “%Inc%”);

In Example 2 the NOT EXISTS subquery tests each row in the AIRLINES table for the non existence of a match in the CARRIER table.

Implementing Example 2 with Xcalar Design

It is assumed that you have an Xcalar Design worksheet containing airlines and carriers tables.

  1. Log in to Xcalar Design and open the worksheet that contains the airlines and carriers tables.
  2. Figure 6: Xcalar Design worksheet containing the airlines and carriers tables

  3. Right-click on the carriers subquery table heading, select Filter and in the FILTER panel create a like filter with an *Inc* matching pattern.
  4. Figure 7: Carriers table Filter operation

  5. Right-click on the carriers subquery table, select Group By and create a GROUP BY statement on the carrier column of the subquery, which makes the join unique to avoid cartesian products. Enter a new column name and a new table name in the appropriate fields. In this case, a Carrier_count column is created in the carriers_carrier_grpby table.
  6. NOTE: If the joining column is unique by design there is no need to create a group by operation.

    Figure 8: Carriers table Group By operation

  7. Right-click on the column in the new table that you want to join in your main table, select Join and then from the Join Type list, select Left Outer Join . In the Left and Right Table lists and the Left and Right Columns fields select or enter the tables and columns to join.
  8. Figure 9: Join operation first page

  9. Click NEXT.
  10. Select at least one column from right table, enter a new table name, and complete the join operation by clicking JOIN TABLES.
  11. Figure 10: Join operation second page

  12. In the new table Query_notexist_output, right-click on the column selected from right side join table (in this case the Carrier column) containing FNF (Field Not Found) values and select Filter this value. This results in the column Carrier containing only FNF values.
  13. Figure 11: FNF Filter operation

    Figure 12 shows the resultant query table created from a Xcalar NOT EXISTS operation.

    Figure 12: Query Not Exists table

 

Go to top