Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

1 votes

0 answers

113 views

How to pass multiple values into hive hql for the same hivevar


Requirement:
My HQL script contains the statement below, and I want to pass the values in the WHERE clause dynamically. How can I do this with hivevar in this specific scenario, where multiple values are expected? In other words, how do I invoke the HQL with a hivevar defined as ('a','c')?

    CREATE TABLE newtbl
    AS SELECT * FROM temptbl WHERE
    id IN ('a', 'c')
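
One way this is commonly handled, sketched here under assumptions: Hive substitutes `${hivevar:...}` references textually before the statement is parsed, so the whole quoted list can travel in a single variable. The variable name `id_list` and the script name `create_newtbl.hql` are hypothetical, chosen only for illustration.

```sql
-- create_newtbl.hql (hypothetical file name)
-- Invoke with the whole quoted list in one variable, for example:
--   hive --hivevar id_list="'a','c'" -f create_newtbl.hql
-- Hive replaces ${hivevar:id_list} with the literal text 'a','c' before parsing.
CREATE TABLE newtbl AS
SELECT *
FROM temptbl
WHERE id IN (${hivevar:id_list});
```

Because the substitution is plain text replacement, the single quotes around each value must be part of the variable's value, and the outer double quotes are only there to protect them from the shell.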

RaCh (11 rep)

Dec 21, 2023, 12:38 AM

1 votes

1 answers

308 views

For each tuple, get the name of the first column which is non-zero

hive hiveql


I have a table in Hive which looks like:

| Name | 1990 | 1991 | 1992 | 1993 | 1994 |
| Rex  | 0    | 0    | 1    | 1    | 1    |
| Max  | 0    | 0    | 0    | 0    | 1    |
| Phil | 1    | 1    | 1    | 1    | 1    |

I would like to get, for each row, the name of the first column which is non-zero, so something like:

| Name | Column |
| Rex  | 1992   |
| Max  | 1994   |
| Phil | 1990   |

For each row, it is guaranteed that:

* There is at least one column with "1"; and
* If column X is "1", then every column Y > X is also "1".
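
A minimal sketch of one possible approach, assuming the table is called `yearly_flags` (a hypothetical name) and the year columns are numeric: a `CASE` expression tests its branches in order, so the first branch that matches names the first non-zero column. Column names that start with a digit need backtick quoting in Hive.

```sql
-- yearly_flags is a hypothetical table name; the columns are those shown above.
SELECT
  name,
  CASE
    WHEN `1990` <> 0 THEN '1990'
    WHEN `1991` <> 0 THEN '1991'
    WHEN `1992` <> 0 THEN '1992'
    WHEN `1993` <> 0 THEN '1993'
    WHEN `1994` <> 0 THEN '1994'
  END AS first_nonzero_column  -- NULL only if every column is zero
FROM yearly_flags;
```

This relies only on the left-to-right evaluation of `CASE`, so it does not depend on the second guarantee. The drawback is that every year column has to be listed by hand.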

user2891462 (113 rep)

Nov 28, 2021, 04:47 PM • Last activity: Dec 1, 2021, 05:21 PM

1 votes

0 answers

286 views

Container is running beyond physical memory and killed on request (code 143)

query-performance query memory aws hiveql


I ran a query which involved a `JOIN` and an `AVG` operation:

```
SELECT
  AVG(user_amount) AS average_user_per_game
FROM
  (
    SELECT
      start_time
     ,end_time
     ,COUNT(DISTINCT ID) AS user_amount
    FROM
      (
        SELECT
          start_time
         ,from_unixtime(unix_timestamp(start_time) + 3600 * 4) AS end_time
        FROM
          table1
        WHERE

      ) a
    JOIN
      (
        SELECT
          session_start_date_time
         ,ID
        FROM
          table2
        WHERE
          app_id = 'SOMETHING'
      ) b
        ON b.session_start_date_time >= a.start_time
           AND b.session_start_date_time <= a.end_time
    GROUP BY
      start_time
     ,end_time
  ) users
```

Which returned the following:

```
Status: Failed
Application application_XXXXXXX failed 2 times due to AM Container for appattempt_XXXXXXX exited with  exitCode: -104
Failing this attempt.Diagnostics: [2021-08-17 17:41:17.384]Container [pid=XXXX,containerID=container_XXXX] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 2.8 GB of 5 GB virtual memory used. Killing container.
.
.
.
.
.
.
.
.
[2021-08-17 17:41:17.397]Container killed on request. Exit code is 143
[2021-08-17 17:41:17.407]Container exited with a non-zero exit code 143. 
For more detailed output, check the application tracking page: SOME URL Then click on links to logs of each attempt.
. Failing the application.
```

What should I modify to let it run without errors?
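
The log says the application master (AM) container hit its 1 GB physical memory limit, so one setting worth checking is the AM memory allocation. The fragment below is a sketch, not a confirmed fix: the values are illustrative assumptions, which property applies depends on whether the query runs on Tez or MapReduce, and the setting may need to be in place before the session starts or be configured at cluster level.

```sql
-- Illustrative values only; keep them within the cluster's YARN container limits.

-- If Hive runs on Tez: memory for the application master container, in MB.
SET tez.am.resource.memory.mb=2048;

-- If Hive runs on MapReduce: the equivalent application master settings.
SET yarn.app.mapreduce.am.resource.mb=2048;
SET yarn.app.mapreduce.am.command-opts=-Xmx1638m;
```

The JVM heap (`-Xmx`) is kept below the container size here so that non-heap memory still fits inside the limit YARN enforces.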

Memphis Meng (111 rep)

Aug 17, 2021, 05:54 PM • Last activity: Aug 17, 2021, 08:58 PM

0 votes

0 answers

25 views

Field with Top Ranking Field Name

rank hadoop hive hiveql


Let's imagine a table structured like this:

| Bucket | Red | Blue | Green |
| ------ | --- |----- | ----- |
| First  | 1   |3     |4      |
| Second | 6   |5     |2      |

Based on the values within each bucket, I'd like to generate another set of fields holding the colors with the highest, second-highest, and third-highest values (assume there may be more than three colors as well). We are limiting the output to the top 3.

Essentially, what my final output should look like is this:

| Bucket | Red | Blue | Green | Rank 1 | Rank 2 | Rank 3 |
| ------ | --- |----- | ----- | ------ | ------ | ------ |
| First  | 1   |3     |4      | Green  | Blue   | Red    |
| Second | 6   |5     |2      | Red    | Blue   | Green  |

Hoping this isn't a redundant question.
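
One possible way to approach this, sketched under assumptions: unpivot the color columns with `explode` over a `map`, rank the colors within each bucket with `row_number()`, then pivot the top three back into columns. The table name `color_buckets` and the output column names are hypothetical, and the sketch assumes `bucket` is unique per row.

```sql
-- color_buckets is a hypothetical table name.
WITH unpivoted AS (
  -- One row per (bucket, color) pair.
  SELECT t.bucket, t.red, t.blue, t.green, c.color, c.val
  FROM color_buckets t
  LATERAL VIEW explode(map('Red', t.red, 'Blue', t.blue, 'Green', t.green)) c AS color, val
),
ranked AS (
  -- Rank colors inside each bucket, highest value first.
  SELECT bucket, red, blue, green, color,
         row_number() OVER (PARTITION BY bucket ORDER BY val DESC) AS rn
  FROM unpivoted
)
SELECT
  bucket, red, blue, green,
  max(CASE WHEN rn = 1 THEN color END) AS rank_1,
  max(CASE WHEN rn = 2 THEN color END) AS rank_2,
  max(CASE WHEN rn = 3 THEN color END) AS rank_3
FROM ranked
GROUP BY bucket, red, blue, green;
```

Adding more colors only means adding entries to the `map(...)` call. Ties are broken arbitrarily by `row_number()`, so a secondary sort key would be needed for a deterministic order.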
                                

Franco Buhay (1 rep)

Feb 10, 2021, 07:37 PM

1 votes

0 answers

97 views

Can I assign an index to an exploded array?

json array hiveql


I'm doing a `LATERAL EXPLODE` on an array in Hive. Is there a way of reliably assigning a row number based on the array element? Calling `row_number()` on the results of the `LATERAL EXPLODE` appears to work, but I don't know if that is dependable.

It does seem to be in practice. We're doing this now and have never encountered a case where the elements have been enumerated incorrectly.
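
For reference, Hive also provides `posexplode`, which returns each element together with its position in the array, so the index does not depend on row ordering. A minimal sketch, with `source_table`, `id`, and `items` as hypothetical names:

```sql
-- source_table, id and items are hypothetical names used for illustration.
SELECT
  s.id,
  e.pos AS element_index,  -- zero-based position of the element in the array
  e.item
FROM source_table s
LATERAL VIEW posexplode(s.items) e AS pos, item;
```

Add 1 to `pos` if a one-based index is needed to match what `row_number()` produced.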

PhilHibbs (539 rep)

May 5, 2020, 10:18 AM • Last activity: Jun 8, 2020, 11:57 AM

0 votes

1 answers

769 views

Can I MERGE INTO ... INSERT with a column list in Hive?

merge hiveql


Here's a good example of a MERGE statement:

    MERGE INTO target AS T 
    USING source AS S
    ON T.ID = S.ID AND T.tran_time = S.tran_time 
    WHEN MATCHED THEN UPDATE SET quantity = S.quantity
    WHEN MATCHED AND S.quantity IS NULL THEN DELETE
    WHEN NOT MATCHED THEN INSERT VALUES (S.ID, S.quantity, S.tran_time);

When doing the INSERT, does it have to exactly match the structure of the target table, or can a column list somehow be specified, as in a simple INSERT statement? I don't like hard-coding value lists that have to match the order of the columns in the table; I like my statements to be column-order independent.
                                

PhilHibbs (539 rep)

Apr 30, 2020, 01:52 PM • Last activity: Apr 30, 2020, 07:06 PM

1 votes

0 answers

26 views

Defining external table on JSON with an @ sign in an element

json hive hiveql


I need to define a Hive external table onto a JSON file that has @ signs in its elements, e.g.

    { "data": { "@type": "person", "name": "Phil", "job": "Programmer" } }

This works:

    create external table sandbox.test_table
    ( data STRUCT
    )
    ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
    STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
    OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
    LOCATION 's3a://bucket/DEV/data/raw/test/';

However, this misses out the @type element. I've tried these:

    data STRUCT
    data STRUCT

Neither of them works. Any suggestions for how I can do this, or do I need to preprocess the JSON to remove the @ from the element?

                                

PhilHibbs (539 rep)

Mar 25, 2020, 12:17 PM
