Sample Header Ad - 728x90

Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

0 votes
1 answers
42 views
Should I create 2 different databases if they would keep 70% similar data?
I am writing a microservices based NodeJS-Cassandra application and I haVe few services they would need 70% similar data( like username, avatar, videos, etc.) and I am just wondering to know is it a good idea that I create 2 databases like: DB1: > A,B,C,D,E,F,G,H,I,J,K,L,M,N DB2: > A,B,C,D,E,F,G,H,I...
I am writing a microservices based NodeJS-Cassandra application and I haVe few services they would need 70% similar data( like username, avatar, videos, etc.) and I am just wondering to know is it a good idea that I create 2 databases like: DB1: > A,B,C,D,E,F,G,H,I,J,K,L,M,N DB2: > A,B,C,D,E,F,G,H,I,J,W,X,Y,Z Or just create one database like: > A,B,C,D,E,F,G,H,I,J,K,L,M,N,W,X,Y,Z I know from microservices concept that we should give it's own database to each service, but as long as my data contains some massive data like video files, I don't know what should I do in this specific case? I like to know if there is any rule of thumb says **"if your databases would have more than x-% of similar data, or the data you should replicate in both databases exceeds x-GB, it's better to keep them in 1 database"**?
best_of_man (117 rep)
May 22, 2023, 05:34 PM • Last activity: May 23, 2023, 12:04 AM
2 votes
2 answers
68 views
Complicated relationship Q
I'm trying to do a query to find products that *aren't* on the customer's authorized products list, and it's breaking my brain. Each customer has an authorized product group list, and within a product group each entry might consist of a Product ID, a Product Category ID, Vendor Number, Price Schedul...
I'm trying to do a query to find products that *aren't* on the customer's authorized products list, and it's breaking my brain. Each customer has an authorized product group list, and within a product group each entry might consist of a Product ID, a Product Category ID, Vendor Number, Price Schedule ID, or Brand ID (all different columns on the Product record). Many authorized product group lists have a mix of 2 or more kinds of entries, and often there's some overlap between the different columns used as selection criteria. Of course, there's a code on the product group list item that indicates which column the entry is based on, so it's easy enough to parse. On the Product side, I'd be looking for products without an "Inactive" flag that aren't on the authorized list for each customer. The tables look roughly like this: **Customer** CustomerID |AuthorizedListID ----------|---------------- 1 |345 2 |345 3 |117 **Authorized List** ListID |ListName ------|----------------- 117 |Frozen goods 345 |Pizza **Authorized List Items** ListKey |ItemRefNo |Type -------|---------|---- 117 |68 |BRND 117 |202 |ITEM 117 |1415 |ITEM 117 |3012 |ITEM 117 |61152 |CATG 117 |31667 |PRIC 345 |9615 |ITEM 345 |2100 |ITEM 345 |13229 |VNDR 345 |87211 |CATG **Products** ProductID|CategoryKey|PriceKey|VendorKey|BrandKey|Inactive ---------|-----------|--------|---------|--------|-------- 202 |61152 |113 |45529 |16 |0 1415 |4990 |5601 |882 |41009 |0 2100 |61152 |37772 |9101 |20900 |0 3557 |87211 |4101 |17743 |23777 |0 4027 |5193 |644 |882 |68 |0 4028 |5193 |580 |882 |68 |0 9615 |87211 |1512 |17743 |8738 |0 13017 |87211 |4040 |13229 |23795 |0 42500 |87211 |20155 |17743 |4355 |0 53529 |1011 |31667 |13229 |602 |1 While I have a fair bit of SQL experience, this has a level of complexity that's beyond me at this point...so part of what's messing me up is that I don't even know how to succinctly and accurately describe what I'm trying to do so I can do a search for it. I don't even know the appropriate tags for this question. Also, I forgot to mention earlier: T-SQL for SQL Server 2016R2. Expected results would be a set of records with CustomerID and ProductID columns where the product wasn't in the authorized list for the customer and Inactive = 0. Ideally, this would be a view. I've gotten as far as this query: SELECT C.*, A.ItemRefNo, A.Type, CASE WHEN CA.ProductId IS NOT NULL THEN CA.ProductId WHEN B.ProductId IS NOT NULL THEN B.ProductId WHEN V.ProductId IS NOT NULL THEN V.ProductId WHEN P.ProductId IS NOT NULL THEN P.ProductId WHEN A.Type = 'ITEM' THEN A.ItemRefNo ELSE NULL END AS ProductId FROM Customer C INNER JOIN authorizedListItems A ON C.AuthorizedListId = A.ListKey LEFT JOIN products CA ON A.ItemRefNo = CA.CategoryKey AND A.Type = 'CATG' LEFT JOIN products B ON A.ItemRefNo = B.BrandKey AND A.Type = 'BRND' LEFT JOIN products V ON A.ItemRefNo = V.VendorKey AND A.Type = 'VNDR' LEFT JOIN products P ON A.ItemRefNo = P.PriceKey AND A.Type = 'PRIC' Query results: CustomerId|AuthorizedListId|ItemRefNo|Type|ProductId ----------|----------------|---------|----|--------- 3 |117 |68 |BRND|4027 3 |117 |68 |BRND|4028 3 |117 |202 |ITEM|202 3 |117 |1415 |ITEM|1415 3 |117 |3012 |ITEM|3012 3 |117 |61152 |CATG|202 3 |117 |61152 |CATG|2100 1 |345 |9615 |ITEM|9615 2 |345 |9615 |ITEM|9615 1 |345 |2100 |ITEM|2100 2 |345 |2100 |ITEM|2100 1 |345 |13229 |VNDR|13017 1 |345 |13229 |VNDR|53529 The result I'd like to see is: CustomerId|ProductId ----------|--------- 1 |202 1 |1415 1 |4027 1 |4028 2 |202 2 |1415 2 |4027 2 |4028 3 |3557 3 |9615 3 |13017 3 |42500 Let me say in advance, I know I wasn't filtering inactive items in the query and the results include superfluous columns.
asasaki (63 rep)
May 9, 2023, 05:55 PM • Last activity: May 11, 2023, 12:36 AM
-1 votes
2 answers
558 views
To inquire about the query complexity of the following query in SQL
I would like to inquire about the complexity of the following example query in SQL: ``` SELECT id from DB where Country = "India" AND Size = "Large" ``` That is, taking the AND of rows matching the filters for any two columns out of **m>>2** columns. I want to return the first column which contain t...
I would like to inquire about the complexity of the following example query in SQL:
SELECT id
from DB
where Country = "India"
    AND Size = "Large"
That is, taking the AND of rows matching the filters for any two columns out of **m>>2** columns. I want to return the first column which contain the ids. Can it be done in time sublinear in the number of rows? Thank you.
Neel Karia (1 rep)
Feb 14, 2022, 09:31 AM • Last activity: Feb 15, 2022, 03:04 AM
1 votes
2 answers
382 views
Does the PostgreSQL array inTOAST storage have random access times?
This question is related to PostgreSQL TOAST storage and the GIS.SE question: [Should TOAST compression be disabled for PostGIS?][1] *Basically, I was wondering if there is any **guarantee** of constant time complexity (O(1)) for the random access of array elements?* That is, to get `arr[n]`, is the...
This question is related to PostgreSQL TOAST storage and the GIS.SE question: Should TOAST compression be disabled for PostGIS? *Basically, I was wondering if there is any **guarantee** of constant time complexity (O(1)) for the random access of array elements?* That is, to get arr[n], is the worst case number of steps required a constant (i.e. O(1) ) or something else (O(log n) etc.)? I am asking because in certain data forms such as PostGIS lines or rasters, the main data is logically an array (of coordinates). It is known that access time for such data can be extremely long once the data amount exceed a certain limit (like 500 points). A possible reason is that data of such sizes are transferred to TOAST storage and potentially the data can be compressed (e.g. with the main storage). It's not clear how PostgreSQL can predict the approximate location of an element and still offer random access time. The access time for arrays in most programming languages is constant (O(1)). And that is the point of using an array. Just out of curiosity: *Do PostgreSQL arrays have constant access times?* (when and when not?)
tinlyx (3820 rep)
Oct 12, 2021, 09:45 AM • Last activity: Oct 13, 2021, 11:15 AM
Showing page 1 of 4 total questions