HIVE-28655: Implement HMS Related Drop Stats Changes Part2 (param COLUMN_STAT_ACCURATE related changes) #5790

Open
wants to merge 1 commit into base: master

Conversation

DanielZhu58
Contributor

What changes were proposed in this pull request?

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?

@@ -7431,6 +7431,13 @@ public boolean delete_column_statistics_req(DeleteColumnStatisticsRequest req) t
if (!isPartitioned || req.isTableLevel()) {
ret = rawStore.deleteTableColumnStatistics(parsedDbName[CAT_NAME], parsedDbName[DB_NAME], tableName, colNames, engine);
if (ret) {
Map<String, String> parameters = table.getParameters();
Member

Can we just remove the deleted columns from COLUMN_STATS_ACCURATE, using StatsSetupConst.removeColumnStatsState?
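
A minimal sketch of what removing only the deleted columns could look like. It models the COLUMN_STATS_ACCURATE value as an in-memory object rather than the JSON string Hive stores, and `ColumnStatsState` with its methods is a hypothetical stand-in, not the implementation of StatsSetupConst.removeColumnStatsState.

```java
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeMap;

// Hypothetical in-memory model of the COLUMN_STATS_ACCURATE value.
// Hive keeps this as a JSON string in the table/partition parameters;
// this class only mirrors its shape for illustration.
final class ColumnStatsState {
    boolean basicStats;
    // TreeMap keeps column names sorted so the rendered output is stable.
    final Map<String, Boolean> columnStats = new TreeMap<>();

    // Drop the entries for the given columns and leave everything else untouched.
    void removeColumns(List<String> colNames) {
        for (String col : colNames) {
            columnStats.remove(col);
        }
    }

    // Render in the same shape as the q.out lines quoted in this review.
    String render() {
        StringJoiner top = new StringJoiner(",", "{", "}");
        if (basicStats) {
            top.add("\"BASIC_STATS\":\"true\"");
        }
        if (!columnStats.isEmpty()) {
            StringJoiner cols = new StringJoiner(",", "{", "}");
            for (Map.Entry<String, Boolean> e : columnStats.entrySet()) {
                cols.add("\"" + e.getKey() + "\":\"" + e.getValue() + "\"");
            }
            top.add("\"COLUMN_STATS\":" + cols);
        }
        return top.toString();
    }
}
```

With col1 and col2 marked accurate, deleting the stats for col1 would leave BASIC_STATS and the col2 entry in place instead of invalidating the whole parameter.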

}
partParams.put("COLUMN_STATS_ACCURATE", "false");
partition.setParameters(partParams);
rawStore.alterPartition(parsedDbName[CAT_NAME], parsedDbName[DB_NAME], tableName, partVals, partition, null);
Member

Can we alter the partitions in bulk?
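
A rough sketch of the bulk pattern, under stated assumptions: `PartitionStub`, `PartitionStore`, and `BulkPartitionUpdate` are hypothetical names for illustration, and the real RawStore methods take more arguments than shown. The idea is to apply the parameter change to every partition in memory first, then persist all of them in a single store call.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical stand-in for a metastore partition: values plus parameters.
final class PartitionStub {
    final List<String> values;
    final Map<String, String> parameters;

    PartitionStub(List<String> values, Map<String, String> parameters) {
        this.values = values;
        this.parameters = new HashMap<>(parameters);
    }
}

// Hypothetical store interface with a single batch call.
interface PartitionStore {
    void alterPartitions(List<PartitionStub> partitions);
}

final class BulkPartitionUpdate {
    // Apply the update to each partition in memory, then persist once.
    static int apply(List<PartitionStub> partitions,
                     Consumer<Map<String, String>> update,
                     PartitionStore store) {
        List<PartitionStub> changed = new ArrayList<>();
        for (PartitionStub p : partitions) {
            update.accept(p.parameters);
            changed.add(p);
        }
        if (!changed.isEmpty()) {
            store.alterPartitions(changed); // one round trip instead of one per partition
        }
        return changed.size();
    }
}
```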

} else {
// remove the deleted column names in parameter COLUMN_STATS_ACCURATE
StatsSetupConst.removeColumnStatsState(table.getParameters(), colNames);
}
Member

We need to persist the changed parameters via rawStore.alterTable().
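
To illustrate why the write-back matters, here is a small sketch with a hypothetical `InMemoryTableStore`. It assumes the parameters handed to the caller are a detached copy, which may or may not match how the metastore object behaves in this code path; under that assumption, edits stay invisible to the store until they are written back.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory store; stands in for the metastore backend.
final class InMemoryTableStore {
    private final Map<String, Map<String, String>> tables = new HashMap<>();

    void createTable(String name, Map<String, String> params) {
        tables.put(name, new HashMap<>(params));
    }

    // Returns a detached copy (assumption), so caller edits do not reach the store.
    Map<String, String> getParameters(String name) {
        return new HashMap<>(tables.get(name));
    }

    // Writes the (possibly modified) parameters back to the store.
    void alterTable(String name, Map<String, String> params) {
        tables.put(name, new HashMap<>(params));
    }
}
```

In this model, removing a key from the map returned by `getParameters` changes nothing in the store; only the follow-up `alterTable` call makes the change durable.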

} else {
// remove the deleted column names in parameter COLUMN_STATS_ACCURATE
StatsSetupConst.removeColumnStatsState(partParams, colNames);
}
Member

We also need to persist the partition parameters after resetting the column stats.

@@ -10363,6 +10365,8 @@ private boolean deleteTableColumnStatisticsViaJdo(String catName, String dbName,
}
mStatsObjColl = (List<MTableColumnStatistics>) query.executeWithArray(params.toArray());
pm.retrieveAll(mStatsObjColl);
Long tableID = directSql.getTableId(dbName, tableName);
Member

@dengzhhu653 dengzhhu653 May 22, 2025

Is there any reason to mix direct SQL with JDO here? I guess deleteTableColumnStatisticsViaJdo is meant for the JDO query path. We also need to handle the same thing in the getSqlResult path.

@dengzhhu653
Member

Any ideas? cc @saihemanth-cloudera @soumyakanti3578

@@ -1105,7 +1105,6 @@ Retention: 0
#### A masked pattern was here ####
Table Type: MANAGED_TABLE
Table Parameters:
COLUMN_STATS_ACCURATE {\"BASIC_STATS\":\"true\"}
Contributor

@soumyakanti3578 soumyakanti3578 Jun 18, 2025

I see that we are removing COLUMN_STATS_ACCURATE here, however, in the corresponding q file, I see this comment:

-- rename a partition should not change its table, partition, and column stats
alter table statsdb1.testpart1 partition (part = 'part1') rename to partition (part = 'part11');
describe formatted statsdb1.testpart1;

I just want us to be sure this is what we intend, since I see several similar changes just below that were caused by renaming or replacing columns. If the tests no longer make sense, maybe we should consider updating them.

@@ -1476,7 +1471,7 @@ Database: statsdb1
Table: testpart1
#### A masked pattern was here ####
Partition Parameters:
-COLUMN_STATS_ACCURATE {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"col1\":\"true\",\"col2\":\"true\"}}
+COLUMN_STATS_ACCURATE {}
Contributor

Here we are printing empty COLUMN_STATS_ACCURATE. If it is easy to remove them, maybe we should do that?
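
One possible shape for that cleanup, as a sketch only: `EmptyStatsCleanup` and `dropIfEmpty` are hypothetical names, and the check treats the value as a plain string rather than parsing it the way the metastore would.

```java
import java.util.Map;

final class EmptyStatsCleanup {
    static final String KEY = "COLUMN_STATS_ACCURATE";

    // Remove the parameter when its value carries no state, so that
    // describe-style output would not print an empty "{}" entry.
    static void dropIfEmpty(Map<String, String> params) {
        String value = params.get(KEY);
        if (value != null && value.replaceAll("\\s", "").equals("{}")) {
            params.remove(KEY);
        }
    }
}
```

The call would go right after the column entries are removed and before the parameters are persisted, so a value that still holds BASIC_STATS is left alone.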

5 participants