Database Administrators
Q&A for database professionals who wish to improve their database skills
Latest Questions
2
votes
1
answers
3999
views
MySQL passwords SHA512 using AES_ENCRYPT instead of ENCRYPT?
A tutorial is asking me to encrypt passwords using SHA512 in the following manner:
INSERT INTO `vir_users` (`id`, `domain_id`, `password`, `email`)
VALUES
('1', '1', ENCRYPT('pw1', CONCAT('$6$', SUBSTRING(SHA(RAND()), -16))), 'email1@example.com'),
('2', '1', ENCRYPT('pw2', CONCAT('$6$', SUBSTRING(SHA(RAND()), -16))), 'email2@example.com');
How would I replace this with MySQL 5.7's non-deprecated AES_ENCRYPT instead of the old ENCRYPT way of doing things?
MySQL's documentation is severely lacking in this case, as it does NOT provide any clear examples.
I found a way of doing it outside the DBMS; however, I would like to keep it all contained and not dependent on external code and/or tools; that's why I would like to know.
When I run this query now, I get this error message:
>'ENCRYPT' is deprecated and will be removed in a future release. Please use AES_ENCRYPT instead
**update:**
The reason it's using this is that I need to store the password in such a way that Dovecot's default_pass_scheme (which is set to SHA512-CRYPT) can validate my passwords.
The $6$ part of the passwords seems to be important, though I don't know why; all passwords that are generated the old way start with it. Please check the tutorial on why this is.
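A hedged note on that prefix: in the crypt(3) salt format, $6$ selects the SHA-512 scheme, which is why every hash produced the old way starts with it, and a crypt()-style hash embeds its own salt, so verification can reuse the stored hash as the salt. A minimal sketch, still using the deprecated ENCRYPT() and the vir_users table from above:
~~~
-- Sketch only: crypt()-style hashes embed their salt, so passing the stored
-- hash back in as the salt reproduces the same hash when the password matches.
SELECT email
FROM vir_users
WHERE password = ENCRYPT('pw1', password);
~~~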
Henry van Megen
(65 rep)
Sep 7, 2017, 10:38 AM
• Last activity: Jul 30, 2025, 09:00 AM
4
votes
1
answers
3252
views
Should I use the deprecated MD5 function in SQL Server?
We'd like to use **MD5** for our hashing function, instead of **SHA_256**, but as of SQL Server 2016, MD5 is deprecated.
We're using this for hashing (comparing which records have changed).
We now face the dilemma of either risking it by using this function,
or incurring the storage and performance overhead of using SHA_256.
It's frustrating Microsoft decided to deprecate these functions even though they are still useful in certain scenarios.
This project isn't a critical component of the business.
We'll likely go with SHA_256, but is this the right choice?
Should new development always avoid deprecated functions?
For context: daily volume will be about 1-2 million rows,
upserting into a 400 million row table that is about 30 columns wide, comparing hashbytes on the fly.
https://learn.microsoft.com/en-us/sql/t-sql/functions/hashbytes-transact-sql?view=sql-server-2017
https://dba.stackexchange.com/questions/35219/choosing-the-right-algorithm-in-hashbytes-function
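For illustration, a minimal sketch of the change-detection comparison with SHA2_256, using hypothetical staging/target table and column names (not the actual schema):
~~~
-- Hypothetical names; the real table is ~30 columns wide, so the CONCAT list
-- would be longer. The '|' separators avoid false matches like 'ab'+'c' vs 'a'+'bc'.
SELECT s.BusinessKey
FROM dbo.Staging AS s
JOIN dbo.Target  AS t ON t.BusinessKey = s.BusinessKey
WHERE HASHBYTES('SHA2_256', CONCAT(s.Col1, '|', s.Col2, '|', s.Col3))
   <> HASHBYTES('SHA2_256', CONCAT(t.Col1, '|', t.Col2, '|', t.Col3));
~~~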
Gabe
(1396 rep)
Feb 7, 2019, 02:38 PM
• Last activity: Jul 19, 2025, 05:50 AM
0
votes
1
answers
1538
views
Oracle generating hash_val from a clob
I have the following table, where I would like the value of hash_val to be generated from clob_val whether clob_val is inserted or updated.
Can this be done by making the column hash_val a virtual column, or does it have to be done via a function and trigger?
I want to use something like the DBMS_CRYPTO call below to generate the value for hash_val. In addition, I would like hash_val to be the primary key.
Any examples setting up the virtual column, if possible, would be greatly appreciated. Thanks to all who answer.
~~~
CREATE table table_z(
seq_num integer GENERATED BY DEFAULT AS IDENTITY (START WITH 1) NOT NULL,
val NUMBER,
hash_val VARCHAR2(1000),
clob_val CLOB);
DBMS_CRYPTO.HASH (clob_val, 6 /*DBMS_CRYPTO.HASH_SH512*/ )
~~~
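As one possible approach (a sketch, not a verified solution; it assumes the schema has EXECUTE on DBMS_CRYPTO and sidesteps the virtual-column question), a BEFORE INSERT OR UPDATE trigger could populate hash_val:
~~~
CREATE OR REPLACE TRIGGER trg_table_z_hash
BEFORE INSERT OR UPDATE OF clob_val ON table_z
FOR EACH ROW
BEGIN
  -- DBMS_CRYPTO.HASH returns RAW; RAWTOHEX turns it into a hex string
  -- that fits the VARCHAR2 hash_val column.
  :NEW.hash_val := RAWTOHEX(DBMS_CRYPTO.HASH(:NEW.clob_val, DBMS_CRYPTO.HASH_SH512));
END;
/
~~~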
Pugzly
(105 rep)
Dec 8, 2021, 05:03 PM
• Last activity: Mar 5, 2025, 01:06 AM
0
votes
1
answers
97
views
Make LIKE 'abc%' query faster?
Will a hash index make this query faster?
LIKE 'abc%'
Hash indices can speed up point queries.
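A hedged sketch of the distinction, assuming PostgreSQL and a hypothetical table t with column col: a hash index only supports equality lookups, so a prefix LIKE needs a B-tree (with a pattern operator class if the collation isn't C):
~~~
CREATE INDEX t_col_hash    ON t USING hash (col);        -- can serve: col = 'abc'
CREATE INDEX t_col_pattern ON t (col text_pattern_ops);  -- can serve: col LIKE 'abc%'
~~~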
Marlon Brando
(141 rep)
Jan 26, 2025, 11:39 AM
• Last activity: Feb 3, 2025, 06:09 PM
0
votes
1
answers
896
views
Is there a cross-platform way to compare data in one column on each side of replicated data, like a checksum or hash?
I have an Oracle 12 database with lots of tables, and I am replicating several of the tables (a subset of rows) into a SQL Server 2016 database. The subset of rows can be established with a WHERE clause on the Oracle side.
I have two web services that can expose anything I want to from that data, one on each side.
**Do you have a suggestion for an approach of what I can expose, then compare to find out if the data between the systems matches?**
From one table, which has a few million rows, I am currently exposing the COUNT(*), which is a no-brainer since it is very efficient. So far, so good.
I'm also exposing the SUM of each of a few NUMBER(18,2) columns and comparing it to the corresponding SUM on the SQL Server side. However, this is problematic, as it has to scan the whole table in SQL Server; it is sometimes blocked, and sometimes might cause other processes to block. I imagine similar problems could occur on the Oracle side too.
Also, the SUM will not tell me if the rows match--it will only tell me that the totals match; if an amount was improperly added to one row and subtracted from another, I wouldn't catch it.
I've pondered whether Oracle's STANDARD_HASH might help me, but it seems cumbersome or error-prone to try to generate the exact same HASH on the SQL Server side, and also this doesn't help with the inefficiency/blocking nature of the call.
So is there any way to have both databases **keep track of a hash, checksum, CRC code, or other summary of a column's data, that is efficient to retrieve**, and that I can then use to compare to have some idea whether data is the same on both sides? It need not be a perfect solution--for example, comparing SUMs is close but perhaps not quite good enough.
As a first stab, I created a "summary" indexed view, with columns derived from SUMs, on the SQL Server side. This makes querying the view very fast, but incurs an additional penalty on every write to the large table underneath. Still, I think it will work, but I'd like to improve on it. Other, better ideas?
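For reference, a rough sketch of that "summary" indexed-view idea, with hypothetical table and column names (indexed views need SCHEMABINDING, COUNT_BIG(*), and SUM over non-nullable expressions):
~~~
CREATE VIEW dbo.ReplicationSummary
WITH SCHEMABINDING
AS
SELECT  RegionId,                               -- hypothetical grouping column
        COUNT_BIG(*)            AS RowCnt,
        SUM(ISNULL(Amount1, 0)) AS SumAmount1   -- ISNULL keeps SUM indexed-view friendly
FROM dbo.BigTable
GROUP BY RegionId;
GO
CREATE UNIQUE CLUSTERED INDEX IX_ReplicationSummary
    ON dbo.ReplicationSummary (RegionId);
GO
~~~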
Patrick Szalapski
(153 rep)
Mar 10, 2021, 05:30 PM
• Last activity: Jan 7, 2025, 04:07 AM
1
votes
0
answers
42
views
Execution plan showing high hash match
I have a SQL query running from database 1 that joins the **Company** table to a view:
SELECT DISTINCT RTRIM([K].[cmpCompanyCode]) AS [cmpCompanyCode]
FROM [dbo].[view] AS [K]
JOIN [database2].[dbo].[Company] AS [C]
ON [K].[cmpCompanyCode] = [C].[CmpCompanyCode];
This is my view:
SELECT RTRIM([CM].[fkOrganizationIDParent]) AS [fkOrganizationIDParent],
[C].[cmpCoID],
RTRIM([CE].[ClientNumber]) AS [cmpCompanyCode],
ISNULL([CE].[z_DoNotReplicate], 0) AS [DoNotReplicateSupervisors],
ISNULL([CE].[z_StopPTOReplication], 0) AS [StopPtoReplication]
FROM [dbo].[Client] AS [CM] WITH(NOLOCK)
JOIN [dbo].[Entity] AS [CE] WITH(NOLOCK) ON [CM].[fkProjectID] = [CE].[fkProjectID]
JOIN [database2].[dbo].[Company] AS [C] ON [CE].[ClientNumber] = [C].[CmpCompanyCode]
WHERE [CM].[z_K] = 1
AND ISNULL([CE].[z_Exclude], 0) = 0
AND [CM].[crStatus] <> 'Terminated'
AND [CM].[IsActive] = 1
AND [CE].[IsActive] = 1
AND ISNULL(RTRIM([CE].[ClientNumber]), '') <> '';
But when I run the first query, my execution plan shows an expensive Hash Match operator.
Its Output List says **dbo.company.CmpCoid, expr1016**.
I have the following indexes on the dbo.Company table:
CREATE NONCLUSTERED INDEX [IDX_CompanyCode] ON [dbo].[Company]
(
[CmpCompanyCode] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
GO
CREATE CLUSTERED INDEX [CDX_CompanyName] ON [dbo].[Company]
(
[CmpCompanyName] ASC,
[CmpCoID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 85) ON [PRIMARY]
GO
I'm not sure how to fix this.
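A hash join by itself isn't necessarily a problem here, but as a hedged suggestion (something to test, not a guaranteed fix), a covering index on the join key that also carries CmpCoID, the other column in the output list, gives the optimizer a narrower structure to join on:
~~~
CREATE NONCLUSTERED INDEX IDX_CompanyCode_Covering
    ON dbo.Company (CmpCompanyCode)
    INCLUDE (CmpCoID);
~~~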

deanpillow
(11 rep)
Feb 15, 2024, 04:26 PM
• Last activity: May 3, 2024, 11:25 AM
0
votes
0
answers
78
views
How do I store a password in such a way that I can obtain the plain-text password again?
Consider the following scenario:
My app is utilizing a third-party API which requires a username and password for authentication. Unlike most APIs, which have one username and password set for the developer to make calls, this API has a unique username and password pair for each customer. Thus, each customer using my app will need to input their username and password for their account on this other service, into my app. I then need to store this data in my own database, as I will need to call the third-party API on my customer's behalf repeatedly without asking them for their credentials over and over. (Several well-known apps use this model, such as the fintech company Plaid.)
I know that the de-facto standard for storing passwords in a database is by using a one-way hashing algorithm. However, in my case, I can't see how this would work, as I need to present a plain-text password to the third-party API, and cannot do so if I've only stored hashed information.
I've had trouble finding clear or trusted guidance about how to proceed in this situation. I'm open to the idea that there are other avenues with this scenario I haven't considered. How can I store the passwords in a secure way (if someone got access to my database, they couldn't get their hands on plain text passwords) in this situation?
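As one illustration only (assuming PostgreSQL with the pgcrypto extension, a hypothetical api_credentials table, and a key supplied by the application or a secret manager, never stored alongside the data):
~~~
-- password_enc is BYTEA; :app_secret_key stands for a bind parameter
-- supplied by the application at runtime, not a value kept in the database.
INSERT INTO api_credentials (customer_id, username, password_enc)
VALUES (42, 'user@example.com',
        pgp_sym_encrypt('third-party-password', :app_secret_key));

SELECT pgp_sym_decrypt(password_enc, :app_secret_key) AS plaintext
FROM api_credentials
WHERE customer_id = 42;
~~~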
JCollier
(101 rep)
Mar 19, 2024, 04:45 AM
• Last activity: Mar 19, 2024, 03:30 PM
0
votes
1
answers
2241
views
Computing hashes in PostgreSQL for JSONB columns
From the PostgreSQL docs:
> jsonb does not preserve white space, does not preserve the order of object keys, and does not keep duplicate object keys. If duplicate keys are specified in the input, only the last value is kept.
I have a jsonb column that contains some data. I extract specific key data (which is also JSON) and hash it using sha256, something like:
SELECT sha256(to_jsonb(rule_element) :: TEXT :: BYTEA);
where rule_element was extracted from the original data. Given that jsonb does not maintain the order of keys and the original jsonb column may be updated in the future, I do not think it is safe to compute hashes, as they might change.
Can I compute consistent hashes for jsonb columns? (99% sure you can't, 1% hope to make my work easier.)
The hashes would be used for unique identification purposes, basically acting as a unique key for different types of JSON compositions.
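A small hedged check worth running on your own version: jsonb is stored in a normalized form, so two equal jsonb values should render to identical text regardless of the key order they were written with, which would make the sha256-of-text approach stable:
~~~
SELECT sha256('{"a": 1, "b": 2}'::jsonb::text::bytea)
     = sha256('{"b": 2, "a": 1}'::jsonb::text::bytea) AS same_hash;  -- expected: true
~~~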
VIAGC
(103 rep)
Jan 17, 2024, 05:27 AM
• Last activity: Jan 17, 2024, 11:28 AM
0
votes
1
answers
792
views
Generating MD5 Hash for NVARCHAR and VARCHAR columns across different databases
I am building a utility that generates a single MD5 hash across all CONCATENATED column values for any table row. To make this work, I have to eliminate NULL values using COALESCE and must CAST NUMBER and DATE types to VARCHAR using consistent patterns.
The purpose of this utility is to compare all the data across two different databases as part of a data migration. We wanted to make sure all of our converted ETL code (stored procedures) produced the same results with 100% accuracy after porting it from Netezza to Azure Synapse Analytics.
For the most part, this works extremely well and we were able to identify numerous bugs using this approach but still, this process isn't perfect ... We cannot use this with BLOB columns and we sometimes get slight differences with FLOAT types. But the real headache is how NVARCHAR is hashed differently between Netezza and Azure Synapse Analytics.
Here is an example... First, I will demonstrate what Netezza gives me.
show nz_encoding
-- NOTICE: UTF8
show server_encoding
-- Current server encoding is LATIN9
CREATE TABLE ENCODING_TEST (
VALUE1 CHARACTER VARYING(255),
VALUE2 NATIONAL CHARACTER VARYING(255)
);
-- The command completed successfully
INSERT INTO ENCODING_TEST (VALUE1, VALUE2) VALUES('très bien', 'très bien');
-- 1 rows affected
SELECT RAWTOHEX(HASH('très bien', 0)) FROM ENCODING_TEST
-- E5D128AFD34139A261C507DA18B3C558
SELECT RAWTOHEX(HASH(VALUE1, 0)) FROM ENCODING_TEST
-- E5D128AFD34139A261C507DA18B3C558
SELECT RAWTOHEX(HASH(VALUE2, 0)) FROM ENCODING_TEST
-- A54489E883CE7705CDBE1FDAA3AA8DF4
SELECT RAWTOHEX(HASH(CAST(VALUE2 AS VARCHAR(255)), 0)) FROM ENCODING_TEST
-- E5D128AFD34139A261C507DA18B3C558
And here is what Azure Synapse Analytics gives me...
CREATE TABLE WZ_BI.ENCODING_TEST (
VALUE1 CHARACTER VARYING(255),
VALUE2 NATIONAL CHARACTER VARYING(255)
);
-- Commands completed successfully.
INSERT INTO WZ_BI.ENCODING_TEST (VALUE1, VALUE2) VALUES('très bien', 'très bien');
-- (1 row affected)
SELECT HASHBYTES ('MD5', 'très bien') FROM WZ_BI.ENCODING_TEST
-- 0xE5D128AFD34139A261C507DA18B3C558
SELECT HASHBYTES ('MD5', VALUE1) FROM WZ_BI.ENCODING_TEST
-- 0xE5D128AFD34139A261C507DA18B3C558
SELECT HASHBYTES ('MD5', VALUE2) FROM WZ_BI.ENCODING_TEST
-- 0xC43534A6812499038457EDF545834866
SELECT HASHBYTES ('MD5', CAST(VALUE2 AS VARCHAR(255))) FROM WZ_BI.ENCODING_TEST
-- 0xE5D128AFD34139A261C507DA18B3C558
The question I have is: how come the MD5 hash for the NVARCHAR column in Netezza is different from the MD5 hash of the same type and value in Azure Synapse Analytics, when both treat the VARCHAR type the same? I really do not want to have to explicitly CAST all NVARCHAR types to VARCHAR to get these to work, but I have not found any other way to make them equivalent.
What am I missing here?
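A hedged way to see what is going on from the SQL Server side: HASHBYTES hashes the raw bytes of its argument, and NVARCHAR in SQL Server is UTF-16LE, so the same characters produce different bytes (and therefore a different MD5) than the UTF-8 bytes Netezza most likely feeds its HASH() function for its NVARCHAR column:
~~~
SELECT CONVERT(VARBINARY(40),  'très bien') AS varchar_bytes,   -- single-byte code page
       CONVERT(VARBINARY(40), N'très bien') AS nvarchar_bytes;  -- UTF-16LE, different bytes
~~~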
Lauren_G
(69 rep)
Jun 29, 2023, 02:38 AM
• Last activity: Jan 12, 2024, 02:05 AM
4
votes
1
answers
3879
views
Create Postgres hash aggregate function
Postgres: 15.1
My goal is simple: I want to create an aggregated hash value for x rows from column c in table z.
Example Input:
SELECT hash_agg(c) AS checksum FROM z;
Example Output:
checksum
51jj5l1jl55
Unfortunately, this seems to be such an irrelevant use case, that it's not implemented by default.
To be more precise: I got the following function to aggregate all my values and hash them with md5:
SELECT md5(string_agg(c, '')) AS checksum FROM z;
This may work with around 5 rows, but I expect the function to crash with a large number of input rows.
It may also be inadequate for hashing a lot of data.
EDIT:
I did manage to build an md5 aggregate function that does not need to build an extremely big array or string beforehand.
(Please correct me if array_agg() can handle a million rows without being a memory concern.)
As @Erwin Brandstetter points out, md5 might not be the best solution in terms of speed, and I would love to use anything more efficient as long as the implementation of that algorithm is open source. It also seems to be a waste of processing power to rehash the existing state `md5(state || input::bytea)`.
If there is a way to 'update' the existing hash with new values, so that the result is the same as hashing the whole dataset from the beginning, I would be grateful.
(And if it is not possible in Postgres, please tell me so I don't have to look any further.)
DROP AGGREGATE IF EXISTS md5_agg(anyelement);
DROP FUNCTION IF EXISTS md5_agg_state_func;
CREATE OR REPLACE FUNCTION md5_agg_state_func(state bytea, input anyelement)
RETURNS bytea AS $$
BEGIN
IF input IS NOT NULL THEN
RETURN md5(state || input::bytea);
ELSE
RETURN state;
END IF;
END;
$$ LANGUAGE plpgsql IMMUTABLE;
CREATE OR REPLACE FUNCTION md5_agg_final_func(state bytea)
RETURNS text AS $$
BEGIN
RETURN encode(state, 'escape');
END;
$$ LANGUAGE plpgsql IMMUTABLE;
CREATE OR REPLACE AGGREGATE md5_agg(anyelement)
(
sfunc = md5_agg_state_func,
stype = bytea,
finalfunc = md5_agg_final_func,
initcond = ''
);
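A usage note on the aggregate above (a sketch, not a statement about hash quality): because the state function chains hashes, the result depends on input order, so an ORDER BY inside the aggregate call keeps the checksum deterministic:
~~~
SELECT md5_agg(c ORDER BY c) AS checksum
FROM z;
~~~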
Yuki
(41 rep)
Jun 28, 2023, 10:10 PM
• Last activity: Jun 29, 2023, 10:06 AM
6
votes
2
answers
2548
views
Byte ordering for multibyte characters in SQL Server versus Oracle
I am currently in the process of migrating data from Oracle to SQL Server and I'm encountering an issue trying to validate the data post-migration.
Environment Details:
Oracle 12:
* AL32UTF8 character set
* Client NLS_LANG - WE8MSWIN1252
* VARCHAR2 field
SQL Server 2016:
* Latin1_General_CI_AS collation
* NVARCHAR field
I'm using DBMS_CRYPTO.HASH on Oracle to generate a checksum of the whole row, then copying to SQL and using HASHBYTES to generate a checksum of the whole row, which I'm then comparing to validate the data matches.
The checksums match for all rows, except those with multibyte characters.
For example, rows with this character: ◦ do not match in the checksums, even though the data is transferred correctly. When I use DUMP in Oracle or convert to VARBINARY in SQL Server the data matches exactly except for the bytes for this character.
In SQL Server, the bytes are 0xE625 and in Oracle they are 0x25E6.
Why are they ordered differently, and is there a reliable way to convert one to the other to ensure the checksum at the other end matches for strings with multi-byte characters?
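One hedged way to confirm this on the SQL Server side: NVARCHAR data is stored as UTF-16LE, so the code point U+25E6 (the ◦ character) comes out with its bytes swapped relative to the big-endian order Oracle's DUMP shows:
~~~
SELECT CONVERT(VARBINARY(4), N'◦') AS utf16le_bytes,  -- 0xE625
       UNICODE(N'◦')               AS code_point;     -- 9702 decimal = 0x25E6
~~~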
HandyD
(10432 rep)
Feb 7, 2019, 03:12 AM
• Last activity: Jun 27, 2023, 05:38 PM
1
votes
0
answers
63
views
Issue with hash storage resulting from password_verify() output
Consider the following code:
$random_token = random_bytes(32);
$token_hash = password_hash(
$random_token,
PASSWORD_DEFAULT
);
$token_base64 = sodium_bin2base64(
$random_token,
SODIUM_BASE64_VARIANT_URLSAFE_NO_PADDING
);
$token_hash is then stored in a DB field of type LONGTEXT with the collation utf8mb4_unicode_ci. You do so with a normal INSERT using PDO, and to connect with PDO, you use:
$pdo = new PDO(
dsn: "mysql:host=host;dbname=db_name;charset=utf8mb4",
username: 'username',
password: 'password',
options: [
// Throw PDO Exception in case of any error
PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
// Fetch the obtained result as associative array by default
PDO::ATTR_DEFAULT_FETCH_MODE => PDO::FETCH_ASSOC
]
);
Upon retrieval of the token as base64-encoded URL parameter, you then retrieve the hash from the database using the same PDO connection syntax as above, and then do:
$random_token = sodium_base642bin(
$token_base64,
SODIUM_BASE64_VARIANT_URLSAFE_NO_PADDING
);
$result = password_verify($random_token, $token_hash);
In roughly 10% of the cases, we get false for $result. How can this be? Are we maybe using the wrong collation / charset? Is it a problem that the PDO instance uses a slightly different charset than the collation of the concerned field?
***UPDATE OF A CASE WHICH FAILED***:
An example of a failed hash verification occurred with this hash here:
$2y$10$BkpF2otty42SSZS4dczmkeTJk7FJeqgt0D3oCuuOuqCKZG97OZtGu
DevelJoe
(163 rep)
Apr 25, 2023, 12:10 AM
• Last activity: May 15, 2023, 08:59 PM
0
votes
0
answers
60
views
Application Role - Reset Password using hash or Encryption form
I need to automate reset password of database application role. Password must be encrypted or hash form due to security reason.
I tried to convert the password to hash form using HASHBYTES as shown below, but it didn't work.
SELECT HASHBYTES('SHA2_256', 'PASSWORD')
GO
USE <database_name>
GO
ALTER APPLICATION ROLE [approle_name] WITH PASSWORD = 'password hash value'
GO
I have investigated resetting application role passwords but have not been able to find much.
Can you help me with how to achieve this, or tell me whether it is possible at all?
Thanks in advance
sonny
SQLHoncho
(1 rep)
Feb 10, 2023, 07:07 AM
• Last activity: Apr 14, 2023, 10:50 PM
0
votes
2
answers
1082
views
Postgresql 12 Custom Hash Function
I have a large table that I would like to partition. The field I want to partition against is a string that contains a coded value, but appears as an integer stored as a string. The "number" is made up of 4 digits, and the first two are what I want to partition against. For instance, the field value "1298" would need to partition using the value 12. I thought this would be a great use for a hash partition with a custom hash function.
**For some reference:**
Queries are frequently made against this column in the form of
... WHERE TheField = '0123'
or ... WHERE TheField IN ('0123', '0145', '0232')
It would be impractical to manually assign each value to a partition, so I thought using a custom hash function would be a practical solution.
**Actual Question:** How do I make this custom hash function and use it to define the partitions?
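As a hedged sketch of the mechanics (all names here are made up; verify against the PostgreSQL 12 documentation for CREATE OPERATOR CLASS): hash partitioning uses the hash operator class of the partition key column, and that class's support function 2 is the 64-bit seeded hash, so a class whose hash only looks at the first two characters would place all '12xx' values in the same partition:
~~~
-- 32-bit hash (support function 1) and 64-bit seeded hash (support function 2),
-- both computed from only the first two characters of the value.
CREATE FUNCTION first_two_hash(val text) RETURNS integer
LANGUAGE sql IMMUTABLE STRICT PARALLEL SAFE
AS $$ SELECT hashtext(substr(val, 1, 2)) $$;

CREATE FUNCTION first_two_hash(val text, seed bigint) RETURNS bigint
LANGUAGE sql IMMUTABLE STRICT PARALLEL SAFE
AS $$ SELECT hashtextextended(substr(val, 1, 2), seed) $$;

CREATE OPERATOR CLASS first_two_hash_ops
FOR TYPE text USING hash AS
    OPERATOR 1 =,
    FUNCTION 1 first_two_hash(text),
    FUNCTION 2 first_two_hash(text, bigint);

CREATE TABLE the_table (
    thefield text
) PARTITION BY HASH (thefield first_two_hash_ops);

CREATE TABLE the_table_p0 PARTITION OF the_table
    FOR VALUES WITH (MODULUS 4, REMAINDER 0);
~~~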
techdude1996
(1 rep)
Sep 14, 2021, 04:41 PM
• Last activity: Feb 13, 2023, 11:03 PM
0
votes
1
answers
253
views
How exactly are the password hashing functions in the pgcrypto module different from the regular MD5 and other algorithms?
This is not technically a database question, but about the innards of Postgres's pgcrypto extension. The documentation says:
> The algorithms in crypt() differ from the usual MD5 or SHA1 hashing
> algorithms in the following respects:
I would like to know how/where might I be able to find the details of the hashing algorithms.
Ideally, I would like to compare the algorithms of pgcrypto with the regular versions of the algorithms.
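Aside from pointers to the source, a small illustration of the behavioural difference the docs describe (assuming the pgcrypto extension is installed):
~~~
SELECT md5('secret');                       -- plain digest: identical output every call
SELECT crypt('secret', gen_salt('bf', 8));  -- salted, iterated: different output every call
-- verification re-uses the stored hash as the salt:
-- crypt('secret', stored_hash) = stored_hash
~~~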
ahron
(833 rep)
Jan 15, 2023, 03:05 PM
• Last activity: Jan 15, 2023, 05:29 PM
1
votes
2
answers
83
views
Concurrency control: B-tree versus hash index
I am new to transactions. When I read the book "Transactional Information Systems", it is mentioned that B-trees use key-range locking to guarantee serializability. **Is it true that all accesses to the relation must be conducted via the same B-tree to guarantee serializability?**
If I understand correctly, **there is no way to guarantee serializability if accesses are via different indexes**. See the following example.
Suppose there are two indexes on the same relation, namely, a B-tree and a hash index, and there are two transactions.
First, one transaction reads the relation using the hash index, e.g., fetchkey(16) to fetch the (currently absent) record with key 16, and finds no match. Since the hash index does not support key-range locking, it cannot lock the next key. Then, the other transaction inserts a record with key 16 via the B-tree. The insertion will succeed, leading to a phantom.
GenGG
(15 rep)
Aug 11, 2022, 02:10 AM
• Last activity: Sep 8, 2022, 03:14 AM
0
votes
1
answers
466
views
Looking for file-based DB engine alternative to MS Access
My company has been chained to this MS Access '97 .MDB format we've had since the dawn of time. Our schema is hierarchical, with multiple one-to-many relationships. We are at a point where we are looking for alternatives, as the slowness and overall clunkiness of Access are beginning to take their toll on our productivity.
Our "modern" methods of accessing the database involve DAO.Net and heavy amounts of hash-based caching. The .NET System.Collections.Generic.Dictionary type has been a god-send here, because without it, I don't know how we would get our work done in a timely manner. We have multiple projects that each have a database file associated with it (sometimes multiple), and we tend to interact them one of two ways: either the DB is created by hand (using our in-house editor), or generated using a program which takes data we receive from another company, in some other format, and converts it to our format.
In both cases, our common .NET library loads the entire database into hash tables via Dictionary and resolves the object relationships with code by looking up values in the hash table by ID. When auto-generating the database, we use another set of hash tables to determine whether an object exists already in the cache before adding it. Once we are finished parsing the source data, we start a multi-threaded bulk insertion operation. We do all this because any other method of accessing the database is very slow.
I hope I've given enough context to my question: Is there a DB engine out there whose query speed can rival that of hash tables like what I am using? Memory and disk usage are no concern, these DB only exist on developer machines, we convert them to a different format for use with our software. I just want to get rid of my hash tables, but I don't want to sacrifice speed to do it.
Kate M
(3 rep)
Aug 24, 2022, 01:24 AM
• Last activity: Aug 24, 2022, 03:54 AM
1
votes
1
answers
762
views
HASHBYTES() result of another HASHBYTES(), OR CHECKSUM_AGG() aggregate for HASHBYTES()?
I want to aggregate a single HASHBYTES() from the results of another HASHBYTES()... something like CHECKSUM_AGG() does for CHECKSUM()...
I have been able to find how to make a single hash per row of columns from a table:
SELECT
hashbytes('MD5', (
SELECT SPECIFIC_SCHEMA
,SPECIFIC_NAME
,ROUTINE_SCHEMA
,ROUTINE_NAME
,ROUTINE_TYPE
,ROUTINE_DEFINITION
FROM (
VALUES (NULL)
) foo(bar)
FOR XML auto
)) AS [Hash]
FROM INFORMATION_SCHEMA.ROUTINES AS MBT;
How do I aggregate to a single hash from all the hash rows returned?
(Yes, I want to reduce a table to a single hash for comparison to table(s) in other databases).
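One hedged way to do this (the FOR XML trick again, this time over the per-row hashes; the ORDER BY matters so both databases concatenate rows in the same order, and on versions before SQL Server 2016 the HASHBYTES input is capped at 8000 bytes):
~~~
WITH row_hashes AS (
    SELECT SPECIFIC_SCHEMA, SPECIFIC_NAME,
           HASHBYTES('MD5', CONCAT(SPECIFIC_SCHEMA, '|', SPECIFIC_NAME, '|',
                                   ROUTINE_DEFINITION)) AS row_hash
    FROM INFORMATION_SCHEMA.ROUTINES
)
SELECT HASHBYTES('MD5',
        (SELECT CONVERT(CHAR(34), row_hash, 1)   -- '0x' + 32 hex chars per row
         FROM row_hashes
         ORDER BY SPECIFIC_SCHEMA, SPECIFIC_NAME
         FOR XML PATH(''))) AS table_hash;
~~~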
Ricky Lively
(54 rep)
Jul 21, 2022, 11:33 PM
• Last activity: Jul 22, 2022, 12:09 AM
0
votes
0
answers
117
views
Cache hit rate decrease in Postgres
On one of our servers, I found a problem: for the last three months, the cache hit rate has been steadily decreasing, from 95% down to 89%. What could this be connected with? How can I tell which queries are not being served from the cache?
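If pg_stat_statements is available, a hedged starting point for the second question is to rank statements by shared buffer reads, which shows which queries are being served from disk rather than the cache:
~~~
SELECT query,
       shared_blks_hit,
       shared_blks_read,
       round(100.0 * shared_blks_hit
             / NULLIF(shared_blks_hit + shared_blks_read, 0), 2) AS hit_pct
FROM pg_stat_statements
ORDER BY shared_blks_read DESC
LIMIT 20;
~~~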
AlexandrT
(11 rep)
Jun 16, 2022, 12:31 PM
0
votes
2
answers
128
views
SQL Server - Finding Parent Source from T-SQL Snippet
The environments for which I'm responsible have some pretty substantial plan non-reuse challenges. I've run across the following query (h/t Brent Ozar blog commenter Michael J Swart) that does a fine job of itemizing the worst offenders:
WITH cte AS (
SELECT COUNT(*) [count], query_hash, min(sql_handle) [sql_handle_example]
FROM sys.dm_exec_query_stats
GROUP BY query_hash
)
SELECT cte.*, t.text [query_text_example]
FROM cte
CROSS APPLY sys.dm_exec_sql_text(sql_handle_example) t
WHERE [count] > 100
ORDER BY [count] DESC
My challenge is taking a snippet of the [query_text_example] text and efficiently identifying whether it originates from a sproc, and if so, which one in which database. I've done some Googling and testing, and it's been puzzlingly difficult to find a solution that takes a snippet of query text, whether it was dynamically built or not, and accurately identifies its parent sproc. Does anyone have any suggestions?
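One hedged refinement of the query above: sys.dm_exec_sql_text() also returns dbid and objectid, which are non-NULL when the batch came from a module, so the parent procedure and database can often be resolved directly instead of searching for the snippet:
~~~
WITH cte AS (
    SELECT COUNT(*) AS [count], query_hash, MIN(sql_handle) AS [sql_handle_example]
    FROM sys.dm_exec_query_stats
    GROUP BY query_hash
)
SELECT cte.[count],
       DB_NAME(t.dbid)                 AS [database_name],
       OBJECT_NAME(t.objectid, t.dbid) AS [parent_module],  -- NULL for ad hoc / dynamic SQL
       t.text                          AS [query_text_example]
FROM cte
CROSS APPLY sys.dm_exec_sql_text(cte.[sql_handle_example]) AS t
WHERE cte.[count] > 100
ORDER BY cte.[count] DESC;
~~~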
AccidentalDBA_CO
(157 rep)
Apr 27, 2022, 07:02 PM
• Last activity: May 1, 2022, 02:30 PM
Showing page 1 of 20 total questions