
What is the best choice for the PK in a partitioned table?

2 votes
1 answer
673 views
I have one large partitioned table (table name: Trans). At the moment this table is to be created on 32 partitions. It will contain approximately 300 million records, and data older than 14 days will be deleted daily. One of the columns in this table references a table that will contain up to 5 million records (table name: Sens), which I also want to partition. I would like to ask:

1. Will it be a problem if both tables use the same partition function? The Sens table would then also be distributed over 32 partitions and would be saved in the same files as the Trans table. Is this a good approach?

2. The Trans table has a PK based on two columns, TranID (IDENTITY(1,1)) and PartitionID. At the moment, the FK to the smaller table ('Sens') is based on only one column, SenID. The smaller table also has to be partitioned. What would be the difference in approach / efficiency / speed of operation if the PK in the Sens table is only on the IDENTITY(1,1) column instead of on the IDENTITY(1,1) column plus the partition column, i.e.
ALTER TABLE [dbo].[Sen]
ADD CONSTRAINT [PK_SenID]
    PRIMARY KEY CLUSTERED ([SenID] ASC) ON [PRIMARY];

-- or 

ALTER TABLE [dbo].[Sen]
ADD CONSTRAINT [PK_SenID]
    PRIMARY KEY CLUSTERED (
                              [SenID] ASC,
                              [PartitionID]
                          ) ON [psTrans]([PartitionID])
3. Have you ever tried using a computed column as the partition column? I am thinking about choosing the partition based on a new column that is computed from another column in the table: CAST(HASHBYTES('MD5', [othercolumnInTable]) AS tinyint) % 32

---

Many thanks for the comprehensive answer. The idea is that there are 32 partitions, 16 files and 8 filegroups. In other words, each filegroup is supposed to contain 2 files (i.e. a total of 4 partitions). Honestly, this is my first time designing a large database where I have to create new filegroups and use partitioning, so the above numbers are indicative. Do you have a recommended way to divide the data into files, filegroups and partitions?

Regarding the partitioning of the Trans table, the partition column will be of type tinyint. Partitioning follows business logic and breaks all the data (about 300 million records) into roughly equal parts (or at least that is the assumption). Thus, partitioning will not be by date, but by a column of type tinyint.

We want to take advantage of partitioning for the Trans table because it will contain a lot of data, i.e. about 300 million records, and it will have about 60 columns. Moreover, the requirement is that the database can handle 300 inserts per second on this table and, at the same time, about 250 updates per second on it. My understanding is that by partitioning this table, the many insert and update operations will hit multiple files at the same time, which should speed things up and let us meet the requirements. Or is my interpretation wrong?

In addition to the Sens table, which I am describing here, there will also be an Events table, which will have FK references to the PK of the Trans table and will contain about 100-200 million records. To summarize, the tables I plan to partition at the moment are Trans (about 300 million records), Sens (about 5 million records) and Events (about 100-200 million records).
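For illustration, the layout described above (32 partitions spread over 8 filegroups, with the computed partition column from point 3) could be sketched roughly as follows. This is only a sketch under assumptions: the partition function name pfTrans, the filegroup names FG1 to FG8 and the source column SenCode are hypothetical, the filegroups are assumed to exist already, and only psTrans, Sen, SenID and PartitionID come from the question. The extra outer CAST is there because tinyint % 32 evaluates to int, while the partition function expects tinyint.

```sql
-- Hypothetical names: pfTrans, FG1..FG8, SenCode. Filegroups FG1..FG8 must already exist.

-- 31 boundary values with RANGE LEFT yield 32 partitions (one per value 0..31).
CREATE PARTITION FUNCTION pfTrans (tinyint)
AS RANGE LEFT FOR VALUES
    ( 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,
     16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30);

-- One filegroup entry per partition: 4 consecutive partitions per filegroup.
CREATE PARTITION SCHEME psTrans
AS PARTITION pfTrans TO
    (FG1, FG1, FG1, FG1,
     FG2, FG2, FG2, FG2,
     FG3, FG3, FG3, FG3,
     FG4, FG4, FG4, FG4,
     FG5, FG5, FG5, FG5,
     FG6, FG6, FG6, FG6,
     FG7, FG7, FG7, FG7,
     FG8, FG8, FG8, FG8);

-- A computed column can only be a partitioning column if it is PERSISTED.
-- MD5 is taken from the question; recent SQL Server versions mark it as deprecated.
CREATE TABLE dbo.Sen
(
    SenID       int IDENTITY(1,1) NOT NULL,
    SenCode     varchar(50)       NOT NULL,  -- hypothetical source column for the hash
    PartitionID AS CAST(CAST(HASHBYTES('MD5', SenCode) AS tinyint) % 32 AS tinyint)
                PERSISTED NOT NULL,
    CONSTRAINT PK_SenID PRIMARY KEY CLUSTERED (SenID ASC, PartitionID)
) ON psTrans (PartitionID);
```

With this variant the PK covers both SenID and PartitionID, so a foreign key pointing at it would have to reference both columns as well.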
All of them would use the same partition function, i.e. they would be spread over 32 partitions, 16 files and 8 filegroups. The Trans table should handle 300 inserts and 250-290 updates per second. The Sens table should handle 200-300 updates per second. The Events table should handle approximately 400-500 inserts per second. The main reason to partition them all is to avoid running all of these operations against one database file and to distribute the load properly instead. You wrote that you have experience with partitioning. Do you think partitioning is a good fit for these requirements?

As for data deletion: every day, data older than 14 days will be removed from the Trans and Events tables. I was thinking of running a separate DELETE for each partition. I have no experience with this and do not know whether it is the most effective option. Moreover, the solution is to run as part of AlwaysOn (so there may be some limitations).
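The per-partition DELETE described above could look roughly like the sketch below. It assumes a hypothetical CreatedAt column of type datetime2 on dbo.Trans that records the row's age, and a hypothetical batch size of 5000 rows; neither is given in the question. Because Events has an FK to Trans, the matching Events rows would need to be removed first.

```sql
-- Hypothetical: CreatedAt (datetime2) column and a batch size of 5000 rows.
DECLARE @cutoff      datetime2 = DATEADD(DAY, -14, SYSUTCDATETIME());
DECLARE @partitionId tinyint   = 0;
DECLARE @rows        int;

-- Walk the 32 partition values (0..31) one at a time.
WHILE @partitionId < 32
BEGIN
    SET @rows = 1;

    -- Delete in small batches until no old rows remain for this partition value.
    WHILE @rows > 0
    BEGIN
        DELETE TOP (5000)
        FROM dbo.Trans
        WHERE PartitionID = @partitionId   -- equality on the partition column
          AND CreatedAt < @cutoff;

        SET @rows = @@ROWCOUNT;
    END;

    SET @partitionId += 1;
END;
```

Since the partition column is not date-based, old rows are mixed into every partition, so whole partitions cannot simply be truncated or switched out; batched deletes like this keep each transaction short instead.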
Asked by axdna (119 rep)
Sep 8, 2020, 12:01 PM
Last activity: Jun 1, 2024, 09:46 AM