Database Administrators
Q&A for database professionals who wish to improve their database skills
Latest Questions
0
votes
3
answers
3751
views
Is it a good idea to join data from two different databases?
I am building an application that requires a lot of tables in one database and while the joining and aggregation of data is really nice and seamless, I am starting to wonder if I am creating too many tables in one database rather than organizing better by creating multiple databases.
1. Is this standard practice in enterprise-level companies?
2. How do you usually join data from two different databases, if it's normal to do so?
3. Does this cause latency issues?
Any help or guidance would be appreciated.
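For what it's worth, a minimal sketch of the common approach on SQL Server, assuming both databases live on the same instance; `SalesDb`, `InventoryDb`, and the table/column names are hypothetical, used only to illustrate three-part naming:

```sql
-- Hypothetical databases and tables; on a single SQL Server instance a
-- cross-database join simply uses database.schema.table names.
SELECT o.order_id, p.product_name
FROM SalesDb.dbo.orders AS o
JOIN InventoryDb.dbo.products AS p
    ON p.product_id = o.product_id;
```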
jaffer_syed
(13 rep)
Sep 21, 2023, 11:22 PM
• Last activity: Mar 25, 2025, 05:30 PM
4
votes
2
answers
439
views
If LATERAL is optional for table-valued functions, then why does this query error without it?
# Setup
CREATE TABLE persons
(
person_id int not null,
name TEXT
);
INSERT INTO persons VALUES
(1, 'Adam'),
(2, 'Paul'),
(3, 'Tye'),
(4, 'Sarah');
CREATE TABLE json_to_parse
(
person_id int not null,
block json
);
INSERT INTO json_to_parse VALUES
(1, '{"size": "small", "love": "x"}'),
(2, '{"size": "medium", "love": "xx"}'),
(3, '{"size": "big", "love": "xxx"}');
# The Error
This runs without issue
SELECT
*
FROM
json_to_parse
CROSS JOIN LATERAL
json_to_record(json_to_parse.block) AS my_json(size TEXT, love TEXT)
INNER JOIN
persons
ON
persons.person_id = json_to_parse.person_id;
but this does not
SELECT
*
FROM
json_to_parse,
json_to_record(json_to_parse.block) AS my_json(size TEXT, love TEXT)
INNER JOIN
persons
ON
persons.person_id = json_to_parse.person_id;
and I get the error "invalid reference to FROM-clause entry for table "json_to_parse""
Why does this second query error? [The docs](https://www.postgresql.org/docs/current/queries-table-expressions.html#QUERIES-LATERAL) make it quite clear that LATERAL is optional for table-valued functions:
> Table functions appearing in FROM can also be preceded by the key word LATERAL, but for functions the key word is optional; the function's arguments can contain references to columns provided by preceding FROM items in any case.
[dbfiddle](https://dbfiddle.uk/An86bWru)
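For reference, a plausible reading of the failing query, leaning on the documented rule that a JOIN binds more tightly than the commas separating FROM items, is that it is grouped as if written like this (the parentheses below are only illustrative, not a suggested fix):

```sql
SELECT *
FROM
    json_to_parse,
    (
        -- the INNER JOIN attaches to the function item, not to json_to_parse
        json_to_record(json_to_parse.block) AS my_json(size TEXT, love TEXT)
        INNER JOIN persons
            ON persons.person_id = json_to_parse.person_id
    );
```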
J. Mini
(1237 rep)
Dec 1, 2024, 03:30 PM
• Last activity: Dec 2, 2024, 12:00 AM
0
votes
0
answers
223
views
PostgreSQL query optimization over multiple tables with moderately large data
We have a data structure in PostgreSQL 16 which is moderately complex, I have a simplified diagram showing the join columns here:
The 3 SURVEY tables on the top left are exceptions because they are not joined with a foreign key; rather, we have a SURVEY_ID column which can point to the ID of any of these 3 SURVEY tables (this is due to historic reasons; another column in the SURVEY_INPUT table decides which of these 3 to use). The SURVEY_INPUT table also has the same SURVEY_ID column, which is not shown here.
Here are the row counts:
- SURVEY_INPUT: 160K
- SURVEY_INPUT_RELATED_ID: 110K
- SURVEY_INPUT_RELATED_DATE: 68K
- SURVEY_INPUT_RELATED_CHARACTERISTICS: 355K
- SURVEY_INPUT_RELATED_PARTY: 400K
- SURVEY_RESPONSE: 29K
- SURVEY_RESPONSE_DATA: 76K
When these tables were designed there were no such requirements, but now we are looking at some specific reports which are ungodly slow. The report must be filtered by the highlighted columns. At the moment we have maybe 1 week's worth of data in these tables and the report can take 10 minutes to load, and we need to support 3 months' worth of data...
What I'm basically looking for, as step 1, is whether to focus on the application-side queries or on database performance... In our original implementation some entity relationships were eager in JPA and some were lazy, which resulted in lots and lots of simple and quick queries being launched, which ultimately resulted in the long execution time. Now the application developer has done some magic, set everything to eager, and JPA now creates a monster query that gets all the data with all the filtering in a single step, as a huge cross join. This is faster, but it can still take anywhere from 3 to 6 minutes, depending on the selected time range.
I tried to work on this large query as well, modified it a bit and added some indexes on the highlighted columns as well as all join columns, but it had minimal impact. Here is a link to my current progress on the query; this version with these parameters took 00:02:59.990 to run and resulted in 209450 rows:
https://explain.depesz.com/s/ROVz
It says the sort in step 6 is spilling to disk due to insufficient work_mem, but even if I remove the ORDER BY completely the query runs at the same speed, or even a few seconds slower.
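As a hedged side note only: if the plan really is spilling its sort to disk, a per-session setting like the one below (the value here is arbitrary, purely for illustration) is one quick way to test whether giving the sort more memory changes anything:

```sql
-- Raise work_mem for the current session only; choose a value the server can afford.
SET work_mem = '256MB';
```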

Gábor Major
(163 rep)
Jun 6, 2024, 02:05 PM
0
votes
1
answers
97
views
Is it possible to sort rows by values from generate_series()?
I multiply the result rows of a SELECT with CROSS JOIN generate_series(1, max). My query is like:
~~~pgsql
select id, name
from person
CROSS JOIN generate_series(1, 4)
where ;
~~~
I have this result:
id|name
-|-
1|name1
1|name1
1|name1
1|name1
2|name2
2|name2
2|name2
2|name2
3|name3
3|name3
3|name3
3|name3
I want the result like this:
id|name
-|-
1|name1
2|name2
3|name3
1|name1
2|name2
3|name3
1|name1
2|name2
3|name3
1|name1
2|name2
3|name3
Is that possible?
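A minimal sketch of one way to get that interleaving, assuming the goal is simply to sort by the generated series value first and the person id second:

~~~pgsql
select p.id, p.name
from person p
cross join generate_series(1, 4) as g(n)  -- g.n = 1..4, one copy per value
order by g.n, p.id;                       -- all ids for n=1, then n=2, ...
~~~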
nanocellule
(3 rep)
May 1, 2023, 09:13 PM
• Last activity: May 2, 2023, 05:24 PM
0
votes
0
answers
74
views
Joining Two Tables via JSONB column (date)
Given the following data:
create table datas (id int, data jsonb);
insert into categories (id, budget)
values
(1, '{"2022-07-01": {"budget_amount": value1, "some": "thing1"},"2022-08-01": {"budget_amount": value2, "some": "thing2"}}'),
(2, '{"2022-08-01": {"budget_amount": value3, "some": "thing3"},"2022-08-01": {"budget_amount": value4, "some": "thing4"}}');
insert into transactions (id, transaction)
values
(1, '{"vendor","equalto.categories.id2",amount, date}'),
(2, '{"vendor","equalto.categories.id1",amount, date}'),
(3, '{"vendor","equalto.categories.id3",amount, date}'),
(4, '{"vendor","equalto.categories.id2",amount, date}');
|id|budget|
|--|------|
|1|{"2022-07-01": {"budget_amount": value1, "some": "thing1"},"2022-08-01": {"budget_amount": value2, "some": "thing2"}}|
|2|{"2022-08-01": {"budget_amount": value3, "some": "thing3"},"2022-08-01": {"budget_amount": value4, "some": "thing4"}}|
|id|vendor|category_id|amount|date|
|-|-|-|-|-|
|1|vendor|equalto.categories.id1|amount|date|
|2|vendor|equalto.categories.id2|amount|date|
|3|vendor|equalto.categories.id3|amount|date|
|4|vendor|equalto.categories.id2|amount|date|
For a particular category I'm looking to do something like the following:
SELECT categories_id, DATE_TRUNC('month',transaction.date) as tmonth, SUM(transactions.amount),
jsonb_path_query(budget.data, '$.*.budget_amount') #>> '{}' as budget_amount
from categories, transactions
JOIN tmonth on jsonb_path_query(budget.data, '$.keyvalue()') ->> 'key'
|Category ID|Month|budget_amount|spend|
|:--------------:|:------------:|:------------:|:------------:|
|categories.id|DATE_TRUNC('month',transaction.date)|budget_amount|sum(transactions.amount)|
Essentially I'm looking to join the top-level key in datas (the first of the month in YYYY-MM-DD format) to the date column (timestamp) in the transactions table. The goal is to get, per row, the budget.id, the month as YYYY-MM-01, the budget, and the sum of monthly transactions for that budget.id.
The issue is that I can't figure out how to join the two tables on a jsonb column. Could someone please recommend a way to combine this data? Any help would be appreciated.
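A hedged sketch of one possible shape for that join, assuming a cleaned-up schema of categories(id, budget jsonb) and transactions(id, category_id, amount, date) rather than the exact snippets above, and using jsonb_each to turn the top-level budget keys into rows:

```sql
SELECT c.id                         AS category_id,
       date_trunc('month', t.date)  AS tmonth,
       b.value ->> 'budget_amount'  AS budget_amount,
       sum(t.amount)                AS spend
FROM categories AS c
CROSS JOIN LATERAL jsonb_each(c.budget) AS b(key, value)   -- one row per "YYYY-MM-01" key
JOIN transactions AS t
  ON t.category_id = c.id
 AND date_trunc('month', t.date) = b.key::date             -- match the month to the JSON key
GROUP BY c.id, tmonth, budget_amount;
```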
RMcLellan
(43 rep)
Feb 9, 2023, 06:38 PM
2
votes
1
answers
280
views
SQL Server JOIN vs sparse table
I need to query the relation between an entity called article (a product sold in a shop) and its characteristics (a set of attributes). The result will always be presented in a paged manner (so only a few article records are read each time).
The goal of the query is to fetch a page of articles that have a set of characteristics (a few dozen per query).
My first idea is to use a join between those two entities; I already have a table representing articles, a table representing characteristics, and a cross-join table that maps the relations between the previous two.
The problem is that articles may be numerous (up to 1 million), while characteristics may also be fairly numerous (a few hundred). That does not seem like a big deal, but I am worried because the distribution is not uniform: for example, it may happen that some characteristics are common to almost all articles (while others may be barely used). Another aspect is that both the article and characteristics tables have a few JSON fields that may be quite big (a few KB of data each).
I am still in the design phase, but I am asking myself whether a join is the best approach in this scenario or whether it is cheaper to build an auxiliary "index table" (with only searchable fields, so without the JSON fields) to use for the query.
I mean programmatically building/rebuilding an auxiliary table, doing one query over that table to fetch the article IDs of the items in the "current page", and then doing another query over the article table (limited to that page's pool of IDs).
To achieve that I thought about a sparse table: a sparse table could have a few hundred columns, so maybe I can create/recreate (periodically) such a search table in which I have one row for each article and one column for each characteristic (each characteristic column is just a bit, and I have an application constraint that guarantees there are fewer than one thousand characteristics).
Given a specific convention (e.g. the characteristic with ID 345 will become column 'C345' in the index table) I can dynamically create a query over such a sparse table. That way, this "index table" is guaranteed to have only 1 million records in the worst-case scenario, while the join may have hundreds of millions of rows if I am particularly unlucky.
That said, I have never used sparse tables like that before and I am not sure they perform well, so I am asking whether this is a good idea according to your experience.
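For comparison, a hedged sketch of the plain join/EXISTS approach over the existing tables, as one baseline the sparse-table idea can be measured against; the characteristic IDs 3 and 13 and the page size of 50 are just example values taken from the sample data further down:

```sql
SELECT a.ID, a.CODE1, a.CODE2, a.CODE3, a.CODE4
FROM dbo.tbl_ana_Articles AS a
WHERE EXISTS (SELECT 1 FROM dbo.tbl_src_ArticlesCharacteristics AS ac
              WHERE ac.ID_ARTICLE = a.ID AND ac.ID_CHARACTERISTIC = 3)   -- e.g. "Size 40 EU"
  AND EXISTS (SELECT 1 FROM dbo.tbl_src_ArticlesCharacteristics AS ac
              WHERE ac.ID_ARTICLE = a.ID AND ac.ID_CHARACTERISTIC = 13)  -- e.g. "Blue"
ORDER BY a.ID
OFFSET 0 ROWS FETCH NEXT 50 ROWS ONLY;  -- one page of article IDs/codes, no JSON columns
```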
Here is the schema of tables I already have (articles, characteristics, and the cross-join table) :
/****** Object: Table [dbo].[tbl_ana_Articles] Script Date: 1/30/2023 6:37:55 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[tbl_ana_Articles](
[ID] [int] IDENTITY(1,1) NOT NULL,
[CODE1] [nvarchar](10) NOT NULL, --optional searchable codes
[CODE2] [nvarchar](10) NOT NULL, --optional searchable codes
[CODE3] [nvarchar](10) NOT NULL, --optional searchable codes
[CODE4] [nvarchar](10) NOT NULL, --optional searchable codes
[_JSON1] [nvarchar](max) NOT NULL, --json fields, quite big (few KB each) not searchable
[_JSON2] [nvarchar](max) NOT NULL, --json fields, quite big (few KB each) not searchable
[_JSON3] [nvarchar](max) NOT NULL, --json fields, quite big (few KB each) not searchable
CONSTRAINT [PK_tbl_ana_articles] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
/****** Object: Table [dbo].[tbl_ana_Characteristics] Script Date: 1/30/2023 6:37:55 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[tbl_ana_Characteristics](
[ID] [int] IDENTITY(1,1) NOT NULL,
[ID_FILTER] [int] NOT NULL, --characteristics are grouped by "kind" eg : the kind "size" may contain the set of characteristic {"big","medium","small"} this ID keep track of the one to many association (not relevant in this scenario)
[CODE1] [nvarchar](10) NOT NULL, --optional searchable codes
[CODE2] [nvarchar](10) NOT NULL, --optional searchable codes
[_JSON1] [nvarchar](max) NOT NULL, --json fields, quite big (few KB each) not searchable
[_JSON2] [nvarchar](max) NOT NULL, --json fields, quite big (few KB each) not searchable
CONSTRAINT [PK_tbl_ana_characteristics] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
/****** Object: Table [dbo].[tbl_src_ArticlesCharacteristics] Script Date: 1/30/2023 6:37:55 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[tbl_src_ArticlesCharacteristics](
[ID_CHARACTERISTIC] [int] NOT NULL,
[ID_ARTICLE] [int] NOT NULL,
[ID_FILTER] [int] NOT NULL,
CONSTRAINT [PK_tbl_src_ArticlesCharacteristics] PRIMARY KEY CLUSTERED
(
[ID_CHARACTERISTIC] ASC,
[ID_ARTICLE] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
Here is how I build the sparse table for the lookup (I use up to 10K columns just for testing purposes):
/****** Object: Table [dbo].[ArticleCarat] Script Date: 1/30/2023 6:53:11 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[ArticleCarat](
[ArtID] [int] NOT NULL,
[CARAT_1] [bit] SPARSE NULL,
[CARAT_2] [bit] SPARSE NULL,
[CARAT_3] [bit] SPARSE NULL,
[CARAT_4] [bit] SPARSE NULL,
...
..
.
[CARAT_10132] [bit] SPARSE NULL,
[CARAT_10133] [bit] SPARSE NULL,
[CARAT_10134] [bit] SPARSE NULL,
[SpecialPurposeColumns] [xml] COLUMN_SET FOR ALL_SPARSE_COLUMNS NULL,
PRIMARY KEY CLUSTERED
(
[ArtID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
**Adding more context:**
As requested, I have provided more context here. The article table will hold data about items that can be purchased in a shop, so entries may be shoes, clothes, or other things. Each item has many codes (their meaning is configurable in the installation phase) used during searches, and has a few non-searchable metadata fields attached (the JSON fields; this data includes descriptions in many languages and pricing information).
The last thing about codes: they should be unique, but I can't really be sure (they are provided by the customer and the data is sometimes messy); showing two articles with the same code is not an issue.
The JSON fields are not important here; I mentioned them because it seems to me that they will affect search performance even when they are not in the select statement.
For that reason, I rarely do a search over the article table but use a special table for searches over codes.
That said, there are characteristics, which are searchable attributes of an article. For example, shoes may have the characteristics "blue - 42EU - running". Characteristics are more like tags here on Stack Overflow.
Here is an example dataset (keep in mind the data and DB schema reflect my real case but are fictional; due to company policy I can't disclose real data):
https://dbfiddle.uk/aBnv_GK2
Here is the code to generate the example. I have also included a filter table; filters are just sets of characteristics related to the same topic.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[tbl_ana_Articles](
[ID] [int] IDENTITY(1,1) NOT NULL,
[CODE1] [nvarchar](10) NOT NULL,
[CODE2] [nvarchar](10) NOT NULL,
[CODE3] [nvarchar](10) NOT NULL,
[CODE4] [nvarchar](10) NOT NULL,
[_JSON1] [nvarchar](max) NOT NULL,
[_JSON2] [nvarchar](max) NOT NULL,
[_JSON3] [nvarchar](max) NOT NULL,
CONSTRAINT [PK_tbl_ana_articles] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[tbl_ana_Characteristics](
[ID] [int] IDENTITY(1,1) NOT NULL,
[ID_FILTER] [int] NOT NULL,
[CODE1] [nvarchar](10) NOT NULL,
[CODE2] [nvarchar](10) NOT NULL,
[_JSON1] [nvarchar](max) NOT NULL,
[_JSON2] [nvarchar](max) NOT NULL,
CONSTRAINT [PK_tbl_ana_characteristics] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[tbl_ana_Filter](
[ID] [int] IDENTITY(1,1) NOT NULL,
[CODE] [nvarchar](20) NOT NULL,
CONSTRAINT [PK_tbl_ana_filter] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[tbl_src_ArticlesCharacteristics](
[ID_CHARACTERISTIC] [int] NOT NULL,
[ID_ARTICLE] [int] NOT NULL,
[ID_FILTER] [int] NOT NULL,
CONSTRAINT [PK_tbl_src_ArticlesCharacteristics] PRIMARY KEY CLUSTERED
(
[ID_CHARACTERISTIC] ASC,
[ID_ARTICLE] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
SET IDENTITY_INSERT [dbo].[tbl_ana_Articles] ON
GO
INSERT [dbo].[tbl_ana_Articles] ([ID], [CODE1], [CODE2], [CODE3], [CODE4], [_JSON1], [_JSON2], [_JSON3]) VALUES (2, N'C0001', N'EAN13_1', N'BATCH_1', N'HU_1', N'{ "description":"shoes for running"}', N'{"price" : "40€"}', N'{"other" : "other"}')
GO
INSERT [dbo].[tbl_ana_Articles] ([ID], [CODE1], [CODE2], [CODE3], [CODE4], [_JSON1], [_JSON2], [_JSON3]) VALUES (3, N'C0002', N'EAN13_2', N'BATCH_2', N'HU_2', N'{ "description":"t-shirt"}', N'{"price" : "10€"}', N'{"other" : "other"}')
GO
INSERT [dbo].[tbl_ana_Articles] ([ID], [CODE1], [CODE2], [CODE3], [CODE4], [_JSON1], [_JSON2], [_JSON3]) VALUES (4, N'C0003', N'EAN13_3', N'BATCH_3', N'HU_3', N'{ "description":"sun glasses"}', N'{"price" : "15€"}', N'{"other" : "other"}')
GO
SET IDENTITY_INSERT [dbo].[tbl_ana_Articles] OFF
GO
SET IDENTITY_INSERT [dbo].[tbl_ana_Characteristics] ON
GO
INSERT [dbo].[tbl_ana_Characteristics] ([ID], [ID_FILTER], [CODE1], [CODE2], [_JSON1], [_JSON2]) VALUES (1, 1, N'Size 38 EU', N'-nd-', N'-nd-', N'-nd-')
GO
INSERT [dbo].[tbl_ana_Characteristics] ([ID], [ID_FILTER], [CODE1], [CODE2], [_JSON1], [_JSON2]) VALUES (2, 1, N'Size 39 EU', N'-nd-', N'-nd-', N'-nd-')
GO
INSERT [dbo].[tbl_ana_Characteristics] ([ID], [ID_FILTER], [CODE1], [CODE2], [_JSON1], [_JSON2]) VALUES (3, 1, N'Size 40 EU', N'-nd-', N'-nd-', N'-nd-')
GO
INSERT [dbo].[tbl_ana_Characteristics] ([ID], [ID_FILTER], [CODE1], [CODE2], [_JSON1], [_JSON2]) VALUES (4, 1, N'Size 41 EU', N'-nd-', N'-nd-', N'-nd-')
GO
INSERT [dbo].[tbl_ana_Characteristics] ([ID], [ID_FILTER], [CODE1], [CODE2], [_JSON1], [_JSON2]) VALUES (5, 1, N'Size 42 EU', N'-nd-', N'-nd-', N'-nd-')
GO
INSERT [dbo].[tbl_ana_Characteristics] ([ID], [ID_FILTER], [CODE1], [CODE2], [_JSON1], [_JSON2]) VALUES (6, 1, N'Size 43 EU', N'-nd-', N'-nd-', N'-nd-')
GO
INSERT [dbo].[tbl_ana_Characteristics] ([ID], [ID_FILTER], [CODE1], [CODE2], [_JSON1], [_JSON2]) VALUES (7, 1, N'Size 44 EU', N'-nd-', N'-nd-', N'-nd-')
GO
INSERT [dbo].[tbl_ana_Characteristics] ([ID], [ID_FILTER], [CODE1], [CODE2], [_JSON1], [_JSON2]) VALUES (9, 2, N'Classic', N'-nd-', N'-nd-', N'-nd-')
GO
INSERT [dbo].[tbl_ana_Characteristics] ([ID], [ID_FILTER], [CODE1], [CODE2], [_JSON1], [_JSON2]) VALUES (10, 2, N'Running', N'-nd-', N'-nd-', N'-nd-')
GO
INSERT [dbo].[tbl_ana_Characteristics] ([ID], [ID_FILTER], [CODE1], [CODE2], [_JSON1], [_JSON2]) VALUES (11, 2, N'Trekking', N'-nd-', N'-nd-', N'-nd-')
GO
INSERT [dbo].[tbl_ana_Characteristics] ([ID], [ID_FILTER], [CODE1], [CODE2], [_JSON1], [_JSON2]) VALUES (12, 3, N'Red', N'-nd-', N'-nd-', N'-nd-')
GO
INSERT [dbo].[tbl_ana_Characteristics] ([ID], [ID_FILTER], [CODE1], [CODE2], [_JSON1], [_JSON2]) VALUES (13, 3, N'Blue', N'-nd-', N'-nd-', N'-nd-')
GO
INSERT [dbo].[tbl_ana_Characteristics] ([ID], [ID_FILTER], [CODE1], [CODE2], [_JSON1], [_JSON2]) VALUES (14, 3, N'Green', N'-nd-', N'-nd-', N'-nd-')
GO
SET IDENTITY_INSERT [dbo].[tbl_ana_Characteristics] OFF
GO
SET IDENTITY_INSERT [dbo].[tbl_ana_Filter] ON
GO
INSERT [dbo].[tbl_ana_Filter] ([ID], [CODE]) VALUES (1, N'Size')
GO
INSERT [dbo].[tbl_ana_Filter] ([ID], [CODE]) VALUES (2, N'Pourpose')
GO
INSERT [dbo].[tbl_ana_Filter] ([ID], [CODE]) VALUES (3, N'Color')
GO
SET IDENTITY_INSERT [dbo].[tbl_ana_Filter] OFF
GO
INSERT [dbo].[tbl_src_ArticlesCharacteristics] ([ID_CHARACTERISTIC], [ID_ARTICLE], [ID_FILTER]) VALUES (3, 2, 1)
GO
INSERT [dbo].[tbl_src_ArticlesCharacteristics] ([ID_CHARACTERISTIC], [ID_ARTICLE], [ID_FILTER]) VALUES (9, 3, 2)
GO
INSERT [dbo].[tbl_src_ArticlesCharacteristics] ([ID_CHARACTERISTIC], [ID_ARTICLE], [ID_FILTER]) VALUES (10, 2, 2)
GO
INSERT [dbo].[tbl_src_ArticlesCharacteristics] ([ID_CHARACTERISTIC], [ID_ARTICLE], [ID_FILTER]) VALUES (12, 4, 3)
GO
INSERT [dbo].[tbl_src_ArticlesCharacteristics] ([ID_CHARACTERISTIC], [ID_ARTICLE], [ID_FILTER]) VALUES (13, 2, 3)
GO
INSERT [dbo].[tbl_src_ArticlesCharacteristics] ([ID_CHARACTERISTIC], [ID_ARTICLE], [ID_FILTER]) VALUES (13, 3, 3)
GO
Skary
(368 rep)
Jan 30, 2023, 05:52 PM
• Last activity: Feb 1, 2023, 08:58 AM
0
votes
0
answers
92
views
How to show all the columns data in full outer join
I have 2 tables.
[![Table 1][1]][1] [![Table 2][2]][2]
The Result is
[![Output][3]][3]
[1]: https://i.sstatic.net/ddRpC.png
[2]: https://i.sstatic.net/XcIBY.png
[3]: https://i.sstatic.net/0kc04.png
The issue is that EnteredBy=23 is not displayed because I am using CT1.EnteredBy.
How can I display the CT11.EnteredBy value instead of NULL?
The actual query I am using is given below:
With CT1 As(
Select C1.EnteredBy,CodingTimeInMin
=Sum(ROUND(CAST(C1.CODINGTIME AS FLOAT)/(CAST(60 AS FLOAT)),4))
from [dbo].[CodedDatas] C1
INNER JOIN
(Select Document_Id,Max(LastModified) As LastModified
from [DBO].[CODEDDATAS]
Group By Document_Id) C2
on C1.Document_ID=C2.Document_ID
And C1.Coded=1 And C1.Revision=0
--And C1.LastModified=C2.LastModified
group by C1.EnteredBy
),
CT11 As(
Select C1.EnteredBy,ReviewedTimeInMin=
Sum(ROUND(CAST(C1.QATime AS FLOAT)/(CAST(60 AS FLOAT)),4))
from [dbo].[CodedDatas] C1
INNER JOIN
(Select Document_Id,Max(LastModified) As LastModified
from [DBO].[CODEDDATAS]
Group By Document_Id) C2
on C1.Document_ID=C2.Document_ID
And C1.Coded=1 And C1.Revision=1
And C1.LastModified=C2.LastModified
group by C1.EnteredBy
),
CT2 As(
Select CodingTimeInHR=
ROUND(CT1.CodingTimeInMin/CAST(60 AS FLOAT), 4) ,
ReviewedTimeInHR=
ROUND(CT11.ReviewedTimeInMin/CAST(60 AS FLOAT), 4) ,
EnteredBy=CT1.EnteredBy
from CT1
full outer join CT11 on CT1.EnteredBy=CT11.EnteredBy
)
select * from CT2;
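A hedged sketch of the usual fix for that NULL, meant to slot into the query above in place of the existing CT2: take the key from whichever side of the full outer join is not NULL.

```sql
CT2 As(
    Select CodingTimeInHR = ROUND(CT1.CodingTimeInMin / CAST(60 AS FLOAT), 4),
           ReviewedTimeInHR = ROUND(CT11.ReviewedTimeInMin / CAST(60 AS FLOAT), 4),
           -- COALESCE picks CT11.EnteredBy when the CT1 side has no match
           EnteredBy = COALESCE(CT1.EnteredBy, CT11.EnteredBy)
    from CT1
    full outer join CT11 on CT1.EnteredBy = CT11.EnteredBy
)
```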



Diya Rawat
(23 rep)
Dec 9, 2022, 04:27 AM
• Last activity: Dec 9, 2022, 10:17 AM
0
votes
1
answers
45
views
Using the correct JOIN
I am trying to create an app for the salespeople who work for my company to keep track of their sales/commission.
When a salesperson brings on an account, their name is noted on the account, and the table might look like:
Table B
| Account | Salesperson |
|:---------:|:-----------:|
| Account A | A |
| Account B | B |
| Account C | B |
| Account D | A |
Then the jobs list might look like
Table A
| Account | Job details | Price |
|:---------:|:-----------:|:-----:|
| Account A | xyz | £x |
| Account B | xyz | £x |
| Account C | xyz | £x |
| Account D | xyz | £x |
| Account A | xyz | £x |
| Account C | xyz | £x |
I am looking for a way to filter the results of table A based on the sales person noted next to the Account in Table B
I have looked at all of the JOIN options but am unsure I am using the correct one.
I have opted, currently, for
SELECT J.*, A.Company, A.SalesPerson FROM TableA A
INNER JOIN TableB B
ON A.Account= B.Account
Where A.Salesperson = "B"
This seems to work for the time being; however, I am unsure whether this is the best join to use and whether it will stand the test of time.
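For comparison, a hedged restatement of the same idea with consistent aliases, assuming TableA holds the jobs and TableB holds the account-to-salesperson mapping; an INNER JOIN is a reasonable fit as long as every job's account exists in TableB:

```sql
SELECT A.*, B.Salesperson
FROM TableA AS A
INNER JOIN TableB AS B
    ON A.Account = B.Account        -- match each job to its account row
WHERE B.Salesperson = 'B';          -- keep only jobs for salesperson B
```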
PaulMcF87
(177 rep)
Nov 29, 2022, 03:50 PM
• Last activity: Nov 29, 2022, 11:45 PM
0
votes
1
answers
313
views
Doesn't T-SQL support correlated CROSS JOINs?
This query works
SELECT QMC.HAUSKEY, t1.STRNAME, t2.HAUS_NR
FROM SWOPS.MIGR.QMCAddresses QMC
CROSS JOIN (SELECT STRNAME FROM SWOPS.MIGR.EB_DICT_STREET_QMC WHERE FK=8055909) t1
CROSS JOIN (SELECT HAUS_NR FROM SWOPS.MIGR.EB_DICT_HAUSNR_QMC WHERE FK=8055909) t2
WHERE 1=1
AND QMC.HAUSKEY=8055909
ORDER BY HAUSKEY, t1.STRNAME, t2.HAUS_NR
However, when I replace the constant 8055909 in the CROSS JOIN subqueries with the column reference QMC.HAUSKEY, the query reports an error:
`Msg 4104, Level 16, State 1, Line 56
The multi-part identifier "QMC.HAUSKEY" could not be bound.`
Why is that?
BACKGROUND:
QMCAddresses contains the buildings and their addresses
EB_DICT_STREET_QMC contains the possible variations of the street-names of the addresses in QMCAddresses
EB_DICT_HAUSNR_QMC contains the possible variations of the house-numbers of the addresses in QMCAddresses
What I am trying to do is to create a cartesian product for each building address consisting of all possible variations of its street-name and all possible variations of its house-number.
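For what it's worth, a hedged sketch of the APPLY variant, which (unlike a CROSS JOIN to a derived table) does allow the right-hand side to reference QMC.HAUSKEY; table and column names are taken from the question:

```sql
SELECT QMC.HAUSKEY, t1.STRNAME, t2.HAUS_NR
FROM SWOPS.MIGR.QMCAddresses AS QMC
CROSS APPLY (SELECT STRNAME FROM SWOPS.MIGR.EB_DICT_STREET_QMC WHERE FK = QMC.HAUSKEY) AS t1
CROSS APPLY (SELECT HAUS_NR FROM SWOPS.MIGR.EB_DICT_HAUSNR_QMC WHERE FK = QMC.HAUSKEY) AS t2
ORDER BY QMC.HAUSKEY, t1.STRNAME, t2.HAUS_NR;
```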
e.g.


Mediterrano
(1 rep)
Sep 7, 2022, 02:11 PM
• Last activity: Sep 7, 2022, 09:49 PM
0
votes
0
answers
54
views
Pivot table but with cross product
I'm currently in the following situation: I have a measurements table that looks something like this:
+----+-----------+-------+---------+
| id | germplasm | trait | m_value |
+----+-----------+-------+---------+
| 1 | g1 | t1 | 1 |
| 2 | g1 | t1 | 2 |
| 3 | g1 | t2 | 3 |
+----+-----------+-------+---------+
And I need to get something of this structure to be able to easily generate a [SPLOM plot](https://plotly.com/javascript/splom/) (multi-dimensional scatter plot):
+-----------+----+----+
| germplasm | t1 | t2 |
+-----------+----+----+
Now I know I can use a pivot query like this:
select id, germplasm, MAX(case when trait = 't1' then m_value end) as 't1', MAX(case when trait = 't2' then m_value end) as 't2' from measurements group by germplasm
Which gives me:
+-----------+----+----+
| germplasm | t1 | t2 |
+-----------+----+----+
| g1 | 2 | 3 |
+-----------+----+----+
However, the row with id 1 gets lost in this because of the MAX. What I'd like to get is this result:
+-----------+----+----+
| germplasm | t1 | t2 |
+-----------+----+----+
| g1 | 1 | 3 |
| g1 | 2 | 3 |
+-----------+----+----+
So basically a cross-product between the trait values for the germplasm (but still with the trait names as columns). Note that not all traits will have numeric values, some are categorical.
If I add the id to the group by, I receive everything in its own row, which also isn't what I want.
I tried searching for "mysql pivot cross product" or "mysql pivot duplicate values", but couldn't find anything.
Does anyone have an idea how to achieve this? Obviously my actual scenario is more complex, with additional columns in the measurements table that are used in the group by, like treatment, replicate, etc., but I should be able to deal with those at a later date.
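A hedged sketch of one way to get that shape in MySQL, assuming the measurements table above: build one derived table per trait and join them on germplasm, which yields the per-germplasm cross product of the two traits' values while keeping the trait names as columns.

```sql
SELECT t1.germplasm,
       t1.m_value AS t1,
       t2.m_value AS t2
FROM (SELECT germplasm, m_value FROM measurements WHERE trait = 't1') AS t1
JOIN (SELECT germplasm, m_value FROM measurements WHERE trait = 't2') AS t2
    ON t2.germplasm = t1.germplasm;   -- every t1 value pairs with every t2 value per germplasm
```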
Baz
(165 rep)
Jul 22, 2022, 08:54 AM
0
votes
0
answers
140
views
Is there a cross join in these CTEs?
I have a query that is constructed from several CTEs (in order to organize the query). I constructed it in my SQL workbench and was happy with the test results on a smaller dataset, so I converted it to SQLAlchemy code for executing it on the production database. Now, SQLAlchemy warns me of a cross join:
SELECT statement has a cartesian product between FROM element(s) "vm_per_bus" and FROM element "voltages". Apply join condition(s) between each element to resolve.
In my query (see below), this happens in the last CTE. My question is now twofold:
1. Is there really an accidental cartesian join?
2. How do I join the two CTEs?
This is the query:
WITH voltages AS
(SELECT muscle_actions.id AS id,
regexp_replace(CAST(experiment_runs.document_json['schedule_config']['phase_0']['environments']['environment']['params']['reward']['name'] AS TEXT), '^[^:]+:([^"]+)"', '\1') AS reward_function,
agents.uid AS agent,
jsonb_array_elements(jsonb_path_query_array(CAST(muscle_actions.sensor_readings AS JSONB), '$.*.value')) AS vm
FROM muscle_actions
JOIN agents ON agents.id = muscle_actions.agent_id
JOIN experiment_run_phases ON experiment_run_phases.id = agents.experiment_run_phase_id
JOIN experiment_run_instances ON experiment_run_instances.id = experiment_run_phases.experiment_run_instance_id
JOIN experiment_runs ON experiment_runs.id = experiment_run_instances.experiment_run_id
WHERE experiment_runs.id IN (8,
11,
19,
25,
29,
33,
37,
41,
45,
49,
53,
57,
61,
65,
69,
73,
77,
81,
85,
89,
93,
97,
120,
122,
137,
127,
132,
142,
147,
152,
157,
162,
199,
204,
209,
214,
219,
224,
229)
AND experiment_run_phases.uid = 'phase_1'),
vm_per_bus AS
(SELECT voltages.id AS id,
voltages.reward_function AS reward_function,
voltages.agent AS agent,
voltages.vm AS vm,
abs(1.0 - CAST(voltages.vm AS REAL)) AS vm_deviation,
abs(1.0 - CAST(voltages.vm AS REAL)) >= 0.15 AS vm_violation
FROM voltages),
vm AS
(SELECT vm_per_bus.id AS id,
max(voltages.reward_function) AS reward_function,
max(voltages.agent) AS agent,
array_agg(vm_per_bus.vm) AS array_agg_1,
max(vm_per_bus.vm_deviation) AS max_vm_deviation,
sum(CAST(vm_per_bus.vm_violation AS INTEGER)) AS num_vm_violations
FROM vm_per_bus,
voltages
GROUP BY vm_per_bus.id)
SELECT vm.reward_function,
vm.agent,
max(vm.max_vm_deviation) AS max_abs_violation,
sum(vm.num_vm_violations) AS num_violations
FROM vm
GROUP BY vm.reward_function,
vm.agent
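For what it's worth, a hedged sketch of the last CTE without the extra FROM item, intended to slot into the query above: vm_per_bus already carries reward_function and agent, so voltages does not need to appear there at all, which also removes the cartesian product.

```sql
vm AS
    (SELECT vm_per_bus.id AS id,
            max(vm_per_bus.reward_function) AS reward_function,
            max(vm_per_bus.agent) AS agent,
            array_agg(vm_per_bus.vm) AS array_agg_1,
            max(vm_per_bus.vm_deviation) AS max_vm_deviation,
            sum(CAST(vm_per_bus.vm_violation AS INTEGER)) AS num_vm_violations
     FROM vm_per_bus                 -- no second, unjoined FROM item
     GROUP BY vm_per_bus.id)
```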
Technaton
(171 rep)
May 27, 2022, 08:06 PM
3
votes
1
answers
2657
views
How do I create a SQL loop that generates future dates based on different frequencies and intervals for each item until a specified end date?
I have a data set that appears like this:
next_generation_date | procedure | interval | frequency
:-- | :-- | :-- | :--
2021-01-17 00:00:00.000 | Clean Restroom | 1 | day
2021-01-17 00:00:00.000 | Vacuum | 2 | week
2021-02-01 00:00:00.000 | Inspect Fire Extinguisher | 3 | month
2021-10-01 00:00:00.000 | Test Generator | 4 | year
My goal is to generate multiple date rows for each **procedure** by going off of the **next_generation_date**, **interval** and **frequency** columns until a specified end date. This specified end date would be the same date throughout the entire table. In this example, let's make the specified end date 2025-12-31. My end goal is for the table to appear similar to the below. Please note since this is an example, I didn't include every row as to avoid having hundreds of rows listed below.
next_generation_date | procedure | interval | frequency
:-- | :-- | :-- | :--
2021-01-17 00:00:00.000 | Clean Restroom | 1 | day
2021-01-18 00:00:00.000 | Clean Restroom | 1 | day
2021-01-19 00:00:00.000 | Clean Restroom | 1 | day
2021-01-20 00:00:00.000 | Clean Restroom | 1 | day
2021-01-17 00:00:00.000 | Vacuum | 2 | week
2021-01-31 00:00:00.000 | Vacuum | 2 | week
2021-02-14 00:00:00.000 | Vacuum | 2 | week
2021-02-28 00:00:00.000 | Vacuum | 2 | week
2021-02-01 00:00:00.000 | Inspect Fire Extinguisher | 3 | month
2021-05-01 00:00:00.000 | Inspect Fire Extinguisher | 3 | month
2021-08-01 00:00:00.000 | Inspect Fire Extinguisher | 3 | month
2021-11-01 00:00:00.000 | Inspect Fire Extinguisher | 3 | month
2021-10-01 00:00:00.000 | Test Generator | 4 | year
2025-10-01 00:00:00.000 | Test Generator | 4 | year
To summarize the above table, future "Clean Restroom" dates are each day, "Vacuum" dates are every two weeks, "Inspect Fire Extinguisher" dates are every three months, and "Test Generator" dates are every four years.
Below are two of my most recent attempts in Microsoft SQL Server Management Studio. The first attempt creates the loop but my procedures only increase by an interval of 1, not the unique interval found within the **interval** column. My second attempt generates the next **next_generation_date** with the correct interval but doesn't loop.
CREATE TABLE ##many_integers (idx INT);
WITH units(units) AS (
SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION
SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
)
,tens AS(SELECT units * 10 AS tens FROM units )
,hundreds AS(SELECT tens * 10 AS hundreds FROM tens )
,thousands AS(SELECT hundreds * 10 AS thousands FROM hundreds )
,tenthousands AS(SELECT thousands * 10 AS tenthousands FROM thousands )
,hundredthousands AS(SELECT tenthousands * 10 AS hundredthousands FROM tenthousands)
INSERT
INTO ##many_integers
SELECT hundredthousands +tenthousands +thousands +hundreds +tens +units
FROM units
CROSS JOIN tens
CROSS JOIN hundreds
CROSS JOIN thousands
CROSS JOIN tenthousands
CROSS JOIN hundredthousands ;
SELECT [procedure], [frequency], [interval],
CASE [frequency]
WHEN 'day' THEN DATEADD( day , idx , [next_generation_date])
WHEN 'week' THEN DATEADD( week , idx , [next_generation_date])
WHEN 'month' THEN DATEADD( month , idx , [next_generation_date])
WHEN 'year' THEN DATEADD( year , idx , [next_generation_date])
ELSE NULL
END AS [next_generation_date]
FROM [data].[fact] CROSS JOIN ##many_integers
WHERE idx < 5000
AND
CASE [frequency]
WHEN 'day' THEN DATEADD( day , idx , [next_generation_date])
WHEN 'week' THEN DATEADD( week , idx , [next_generation_date])
WHEN 'month' THEN DATEADD( month , idx , [next_generation_date])
WHEN 'year' THEN DATEADD( year , idx , [next_generation_date])
ELSE NULL
END <= '2023-12-31 00:00:00'
ORDER BY 1
;
SELECT [procedure], [frequency], [interval], [next_generation_date] FROM [data].[fact]
UNION ALL
SELECT [procedure], [frequency], [interval],
CASE
WHEN [frequency] = 'day' THEN DATEADD(day, [interval], [next_generation_date])
WHEN [frequency] = 'week' THEN DATEADD(week, [interval], [next_generation_date])
WHEN [frequency] = 'month' THEN DATEADD(month, [interval], [next_generation_date])
WHEN [frequency] = 'year' THEN DATEADD(year, [interval], [next_generation_date])
ELSE NULL
END AS [next_generation_date]
FROM [data].[fact]
ORDER BY 1;
Any and all suggestions are greatly appreciated as I'm new to SQL. Thank you.
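A hedged tweak of the first attempt above (same tally table, same fact table; the end date is written as 2025-12-31 to match the stated goal): multiplying idx by the row's own [interval] makes each procedure advance by its own step rather than by 1.

```sql
SELECT [procedure], [frequency], [interval],
       CASE [frequency]
           WHEN 'day'   THEN DATEADD(day,   idx * [interval], [next_generation_date])
           WHEN 'week'  THEN DATEADD(week,  idx * [interval], [next_generation_date])
           WHEN 'month' THEN DATEADD(month, idx * [interval], [next_generation_date])
           WHEN 'year'  THEN DATEADD(year,  idx * [interval], [next_generation_date])
           ELSE NULL
       END AS [next_generation_date]
FROM [data].[fact]
CROSS JOIN ##many_integers
WHERE idx < 5000
  AND CASE [frequency]
          WHEN 'day'   THEN DATEADD(day,   idx * [interval], [next_generation_date])
          WHEN 'week'  THEN DATEADD(week,  idx * [interval], [next_generation_date])
          WHEN 'month' THEN DATEADD(month, idx * [interval], [next_generation_date])
          WHEN 'year'  THEN DATEADD(year,  idx * [interval], [next_generation_date])
          ELSE NULL
      END <= '2025-12-31 00:00:00'
ORDER BY 1;
```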
Gulfhawk
(31 rep)
May 24, 2022, 07:45 PM
• Last activity: May 26, 2022, 10:13 AM
5
votes
2
answers
9650
views
Is there such a thing as a LEFT CROSS JOIN?
Is there a way to do what I want to call a "LEFT CROSS JOIN"?
I.e., is there a way to specify a CROSS JOIN ON, but return at least one row in the result for every row in the left, using nulls for the right-side columns when the right-side table does not contain a match?
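A hedged sketch of the behaviour described (a join condition plus preservation of unmatched left-side rows), which is what a LEFT OUTER JOIN provides; the tables l and r and the join column here are hypothetical:

```sql
SELECT l.*, r.*
FROM l
LEFT JOIN r
    ON r.l_id = l.id;   -- unmatched left rows still appear, with NULLs for r's columns
```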
Joshua Goldberg
(227 rep)
Dec 16, 2021, 04:45 PM
• Last activity: Dec 22, 2021, 08:44 PM
1
votes
1
answers
101
views
Pairing Query Store Read/Write Status with Enabled Status
I am able to find if Query Store is enabled:
SELECT name, is_query_store_on from sys.databases
I am able to query a specific database to determine the Query Store Read/Write Status:
SELECT actual_state_desc from sys.database_query_store_options
How do I join the two results to have the information in one table? They do not have a common field, and I assume I could write a CROSS JOIN to match the two tables up, but am unsure how without that common field.
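A hedged sketch for a single database: sys.database_query_store_options is per-database and returns one row for the database you are connected to, so a CROSS JOIN filtered to that database is safe.

```sql
SELECT d.name, d.is_query_store_on, qso.actual_state_desc
FROM sys.databases AS d
CROSS JOIN sys.database_query_store_options AS qso   -- one row, for the current database
WHERE d.database_id = DB_ID();                        -- pair it with its sys.databases row
```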
CrushingIT
(33 rep)
Dec 16, 2021, 07:00 PM
• Last activity: Dec 16, 2021, 08:02 PM
0
votes
1
answers
281
views
Add a list of names to a generated series
I generated a date series this way:
SELECT generate_series(date '2020-01-01', date '2021-01-21', '1 day');
Additionally I have a list of names (Max Meyer, Anna Smith, Peter Gardner).
How could I add the name list to the date series, such that each name appears for every date in the series?
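A minimal sketch, assuming the names can be written inline as a VALUES list: a CROSS JOIN repeats every name for every generated date.

```sql
SELECT d::date AS day, n.name
FROM generate_series(date '2020-01-01', date '2021-01-21', '1 day') AS d
CROSS JOIN (VALUES ('Max Meyer'), ('Anna Smith'), ('Peter Gardner')) AS n(name)
ORDER BY day, n.name;   -- every name once per date
```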
wladi
(1 rep)
Jul 2, 2021, 03:49 PM
• Last activity: Jul 23, 2021, 08:46 AM
Showing page 1 of 15 total questions