feat: Agentic Query In Dashboard #32649
base: master
Conversation
Review by Korbit AI
Korbit automatically attempts to detect when you fix issues in new commits.
Issue | Fix Detected
---|---
Invalid JSON Extraction |
Unsafe List Extraction Logic |
Misleading Function Name |
Missing Authorization Check for Chart Data Access |
Unclear Variable Name |
Incomplete Tool List in Prompt Template | ✅
Invalid timestamp truncation |
Incorrect Return Type Annotation |
Inconsistent Placeholder Values |
Invalid AI Model Name |
Files scanned

File Path | Reviewed
---|---
superset-frontend/src/dashboard/actions/dashboardAgentQuery.js | ✅
superset/dashboards/dashboard_agentic_query/unix_timestamp_to_human_readable.py | ✅
superset/dashboards/dashboard_agentic_query/get_all_charts_name.py | ✅
superset/dashboards/dashboard_agentic_query/get_chart_data.py | ✅
superset/dashboards/dashboard_agentic_query/utils.py | ✅
superset-frontend/packages/superset-ui-core/src/utils/featureFlags.ts | ✅
docker/pythonpath_dev/superset_config.py | ✅
superset/dashboards/dashboard_agentic_query/dashboard_agentic_query.py | ✅
superset-frontend/src/dashboard/components/DashboardGrid.jsx | ✅
superset/dashboards/api.py | ✅
Explore our documentation to understand the languages and file types we support and the files we ignore.
Need a new review? Comment `/korbit-review` on this PR and I'll review your latest changes.

Korbit Guide: Usage and Customization
Interacting with Korbit
- You can manually ask Korbit to review your PR using the `/korbit-review` command in a comment at the root of your PR.
- You can ask Korbit to generate a new PR description using the `/korbit-generate-pr-description` command in any comment on your PR.
- Too many Korbit comments? I can resolve all my comment threads if you use the `/korbit-resolve` command in any comment on your PR.
- On any given comment that Korbit raises on your pull request, you can have a discussion with Korbit by replying to the comment.
- Help train Korbit to improve your reviews by giving a 👍 or 👎 on the comments Korbit posts.
Customizing Korbit
- Check out our docs on how you can make Korbit work best for you and your team.
- Customize Korbit for your organization through the Korbit Console.
from superset.dashboards.dashboard_agentic_query.utils import refactor_input, extract_int_if_possible
from superset.charts.schemas import ChartEntityResponseSchema


def get_charts_list(dashboard_id) -> str:
Incorrect Return Type Annotation 
What is the issue?
The return type annotation is incorrect. The function returns a list of dictionaries, not a string.
Why this matters
This incorrect type hint could mislead other developers and cause type checking tools to fail. The function actually returns a list of dictionaries containing chart information.
Suggested change ∙ Feature Preview
Correct the return type annotation to List[Dict[str, Union[str, int]]]:
from typing import List, Dict, Union
def get_charts_list(dashboard_id) -> List[Dict[str, Union[str, int]]]:
if(len(unix_timestamp) > 10):
    unix_timestamp = unix_timestamp[:10]
Invalid timestamp truncation 
What is the issue?
Arbitrary truncation of timestamps longer than 10 digits could lead to incorrect date conversions
Why this matters
This can result in wrong date calculations for timestamps after September 2001 (timestamps > 999999999) or for millisecond precision timestamps
Suggested change ∙ Feature Preview
Handle different timestamp precisions properly:
def convert_unix_to_human_readable(unix_timestamp):
    unix_timestamp = refactor_input(unix_timestamp)
    unix_timestamp = extract_int_if_possible(unix_timestamp)
    # Handle milliseconds precision
    if len(str(unix_timestamp)) > 10:
        unix_timestamp = int(unix_timestamp) / 1000
    else:
        unix_timestamp = int(unix_timestamp)
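For what it's worth, a complete version that also produces the human-readable string could look like the sketch below. It reuses the PR's utils helpers and assumes a UTC ISO-style output format; the PR's actual format string is not shown here, so treat the final line as an assumption.

from datetime import datetime, timezone

from superset.dashboards.dashboard_agentic_query.utils import (
    extract_int_if_possible,
    refactor_input,
)


def convert_unix_to_human_readable(unix_timestamp):
    # Normalize the raw tool input coming from the agent.
    unix_timestamp = refactor_input(unix_timestamp)
    unix_timestamp = extract_int_if_possible(unix_timestamp)

    value = int(unix_timestamp)
    # Treat values longer than 10 digits as millisecond precision.
    if len(str(abs(value))) > 10:
        value = value / 1000

    # Assumed output format; adjust to whatever the dashboard expects.
    return datetime.fromtimestamp(value, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")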
def extract_json_from_string(input_str):
    start_open_curly = -1
    end_close_curly = -1

    for i in range(len(input_str)):
        if(input_str[i] == '{' and start_open_curly == -1):
            start_open_curly = i
        elif(input_str[i] == '}'):
            end_close_curly = i
Invalid JSON Extraction 
What is the issue?
The function doesn't handle nested JSON objects correctly and can return invalid JSON if the input contains multiple closing braces.
Why this matters
When processing complex dashboard data that contains nested JSON structures, this can lead to corrupted data being returned to the agentic query feature.
Suggested change ∙ Feature Preview
import json


def extract_json_from_string(input_str):
    try:
        start = input_str.find('{')
        if start == -1:
            raise ValueError("No JSON object found")
        brace_count = 0
        for i in range(start, len(input_str)):
            if input_str[i] == '{':
                brace_count += 1
            elif input_str[i] == '}':
                brace_count -= 1
                if brace_count == 0:
                    # Validate the extracted JSON
                    result = input_str[start:i + 1]
                    json.loads(result)  # Will raise JSONDecodeError if invalid
                    return result
        raise ValueError("Unmatched braces")
    except json.JSONDecodeError:
        raise ValueError("Invalid JSON structure")
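A quick illustration of why the brace counting matters, using a made-up agent response containing a nested object:

sample = 'Final Answer: {"chart": {"id": 42, "name": "Sales"}, "note": "ok"} -- end of response'
print(extract_json_from_string(sample))
# -> {"chart": {"id": 42, "name": "Sales"}, "note": "ok"}
# The original first-brace/last-brace slice happens to match here, but with a second
# top-level object later in the string it would merge both into invalid JSON.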
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [get_charts_list, get_chart_data]
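For context, the usual MRKL/ReAct scaffold that this snippet comes from continues roughly as below. This is the generic LangChain-style format, not necessarily the PR's exact template; the tool names are taken from the line above.

Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question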
DASHBOARD_AGENTIC_QUERY_CONFIG = current_app.config["DASHBOARD_AGENTIC_QUERY_CONFIG"]


mrkl_template_value = '''
Unclear Variable Name 
What is the issue?
The variable name 'mrkl_template_value' is unclear and uses an unexplained abbreviation.
Why this matters
Cryptic variable names force readers to research or guess their meaning, slowing down code comprehension.
Suggested change ∙ Feature Preview
agent_prompt_template = '''
def correct_lst(input_str):
    first_open = -1
    last_close = -1
    for i in range(len(input_str)):
        if(input_str[i]=='[' and first_open==-1):
            first_open = i
        elif(input_str[i]==']'):
            last_close = i

    return input_str[first_open : last_close + 1]
Unsafe List Extraction Logic 
What is the issue?
The function fails to handle cases where no brackets are found (first_open or last_close remain -1) or where there are nested lists.
Why this matters
This can lead to IndexError exceptions or incorrect list extraction when processing nested data structures, which is critical for dashboard data processing.
Suggested change ∙ Feature Preview
def correct_lst(input_str):
    if not input_str:
        raise ValueError("Empty input string")
    first_open = input_str.find('[')
    if first_open == -1:
        raise ValueError("No opening bracket found")
    bracket_count = 0
    for i in range(first_open, len(input_str)):
        if input_str[i] == '[':
            bracket_count += 1
        elif input_str[i] == ']':
            bracket_count -= 1
            if bracket_count == 0:
                return input_str[first_open:i + 1]
    raise ValueError("Unmatched brackets")
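A small illustration of the difference, with a made-up agent observation string:

sample = "Observation: [[1, 2], [3, 4]] was returned (ids: [7])"
print(correct_lst(sample))
# Bracket counting -> [[1, 2], [3, 4]]
# The original first-open/last-close slice would have extended to the ']' inside
# "(ids: [7])", returning extra text that is not a valid list.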
def get_chart_data(chart_id) -> str:
    chart_id = refactor_input(chart_id)
    chart_id = extract_int_if_possible(chart_id)
Missing Authorization Check for Chart Data Access 
What is the issue?
The chart_id input validation relies on external utility functions without clear validation guarantees. The code doesn't explicitly verify if the user has permission to access the requested chart data.
Why this matters
Without proper authorization checks, users might be able to access chart data they shouldn't have permission to view. This could lead to data exposure vulnerabilities.
Suggested change ∙ Feature Preview
def get_chart_data(chart_id) -> str:
    # Add authorization check before processing
    if not current_user.has_access_to_chart(chart_id):
        raise PermissionError('User not authorized to access this chart')
    chart_id = refactor_input(chart_id)
    chart_id = extract_int_if_possible(chart_id)
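As a rough, hypothetical sketch only (the DAO import path and the right security-manager check would need to be verified against the current codebase), the guard could reuse Superset's existing datasource-level access check rather than introducing a new per-chart helper:

from superset import security_manager
from superset.daos.chart import ChartDAO  # import path is an assumption; it varies by version


def get_chart_data(chart_id) -> str:
    chart_id = refactor_input(chart_id)
    chart_id = extract_int_if_possible(chart_id)

    chart = ChartDAO.find_by_id(chart_id)
    if chart is None:
        raise ValueError(f"Chart {chart_id} not found")

    # Assumption: gate access on the chart's underlying datasource.
    if not security_manager.can_access_datasource(chart.datasource):
        raise PermissionError("User not authorized to access this chart")
    ...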
    input_str = input_str[:-1]
    return input_str


def extract_int_if_possible(input_str):
Misleading Function Name 
What is the issue?
The function name is misleading: it doesn't actually extract or convert anything to an integer.
Why this matters
The function name suggests integer extraction but it only performs string splitting, leading to confusion for maintainers.
Suggested change ∙ Feature Preview
def extract_value_after_delimiter(input_str):
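Alternatively, if the intent is for the helper to actually do what its current name says, a hypothetical version could coerce to an integer when possible. This is purely illustrative, not the PR's implementation:

import re


def extract_int_if_possible(input_str):
    """Return the first integer found in the input, or the input unchanged."""
    match = re.search(r"-?\d+", str(input_str))
    return int(match.group()) if match else input_str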
    "api_key": "-",
    "base_url": "",
Inconsistent Placeholder Values 
What is the issue?
Placeholder values for sensitive configuration use non-standard conventions.
Why this matters
Using '-' and '' as placeholders is inconsistent and may be confused with valid values. This makes it unclear whether these are intentionally empty or missing values.
Suggested change ∙ Feature Preview
"api_key": None, # Required: Set via environment variable
"base_url": None, # Required: Set via environment variable
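One way to make the "set via environment variable" intent concrete in `superset_config`, using hypothetical environment variable names:

import os

DASHBOARD_AGENTIC_QUERY_CONFIG = {
    "model": "gpt-4o",
    # Hypothetical env var names; pick whatever your deployment uses.
    "api_key": os.environ.get("OPENAI_API_KEY"),
    "base_url": os.environ.get("OPENAI_BASE_URL"),  # only needed for self-hosted LLMs
}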
FEATURE_FLAGS = {"ALERT_REPORTS": True, "SHOW_DASHBOARD_AGENT_QUERY": False}

DASHBOARD_AGENTIC_QUERY_CONFIG = {
    "model": "gpt-4o",
Invalid AI Model Name 
What is the issue?
The model name 'gpt-4o' appears to be incorrect as it's not a standard OpenAI model name (should probably be 'gpt-4' or 'gpt-3.5-turbo')
Why this matters
Using an invalid model name will cause API calls to fail, preventing the Agentic Query feature from functioning
Suggested change ∙ Feature Preview
Correct the model name to a valid OpenAI model identifier:
"model": "gpt-4", # or gpt-3.5-turbo depending on requirements
I think this is an interesting proof of concept, but to me it raises more questions than it provides answers. Raising some of the questions here:
My intuition around many potential AI features - given that there are a hundred ways to implement them depending on prefs / environment constraints, and that design patterns (both UX and backend-wise) are still evolving rapidly - is that they probably belong in extensions, at least for the time being.
@mistercrunch I have addressed your queries above.
@rusackas @mistercrunch |
I'll give you my take, but curious to hear from others too. Here are a few structured-with-AI thoughts:

1. If in Superset, LLM config should be modular and first-class

What we need here goes way beyond a single config block in `superset_config`. Feels like LLM configs should be treated more like database connections: centrally defined and then mapped to features. We'd need:

2. Feature flags aren't the right control plane

The current pattern of scattering feature flags for every AI experiment isn't going to scale. In theory, FFs were designed for progressive rollouts of "forever features," but in practice they've already become a de facto config framework. What we probably need is something more intentional: a centralized configuration surface for AI features.

3. Superset already has a huge product surface

We have a lot of ground to cover already, and anything we land here we're likely stuck supporting for the next decade. Every new AI integration adds cognitive and maintenance load in a fast-moving space with fragile dependencies and shifting UX norms. That's a real risk, especially when the ecosystem and design patterns are changing daily.

4. The right place for this is the extension framework

This is exactly the kind of thing that should live in our (in-dev) extension framework. That would let us:

If Superset becomes more like VSCode, with rich, modular extensions, this type of feature becomes additive instead of burdensome.

5. The way I think about it: MCP is the path forward

MCP is the cleanest way to support AI in a scalable, deterministic way. Instead of trying to bolt LLMs into Superset, we focus on exposing a solid API that external agents can use. The flow becomes:

And the LLM uses MCP calls to introspect metadata, generate SQL, request charts, etc. Superset stays composable, auditable, and predictable, and the fuzzy magic stays outside the boundary.

6. A backend-up abstraction could work well

I'd be happy to help shape a backend-first approach, ideally through an extension:

That gives us a clean, layered system, one that avoids LangChain/LLM lock-in and gives orgs real flexibility (rough sketch at the end of this comment).

7. AI testing is still shaky

Worth calling out: we don't have great patterns yet for testing these kinds of flows. Even small prompt tweaks or model changes can create huge diffs in output. The more this stuff ends up in core, the more brittle our test matrix becomes.

TL;DR: This is a super cool POC, and the use case is legit, but we're not ready to enshrine this in core. Let's use this as a proof point for the extension framework, build modular LLM abstractions, and anchor our long-term AI strategy around MCP. That path gives us power and flexibility, without locking us into early decisions we'll regret.
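To make point 6 a bit more concrete, here's a rough, hypothetical sketch of the kind of provider abstraction I have in mind. None of these names exist in Superset today; it's only meant to illustrate the shape of a centrally configured, feature-mapped LLM connection.

from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LLMResponse:
    text: str
    model: str
    usage: dict = field(default_factory=dict)


class LLMProvider(ABC):
    """A named LLM connection, configured centrally like a database connection."""

    @abstractmethod
    def complete(self, prompt: str, **kwargs) -> LLMResponse:
        ...


class OpenAICompatibleProvider(LLMProvider):
    """Works for OpenAI itself or any self-hosted, OpenAI-compatible endpoint."""

    def __init__(self, api_key: str, base_url: Optional[str] = None, model: str = "gpt-4o"):
        self.api_key = api_key
        self.base_url = base_url
        self.model = model

    def complete(self, prompt: str, **kwargs) -> LLMResponse:
        # Issue the HTTP call to the configured endpoint here.
        raise NotImplementedError


# Connections are defined once, then mapped to features instead of per-feature config blocks.
LLM_CONNECTIONS = {"default": OpenAICompatibleProvider(api_key="...")}
FEATURE_LLM_MAPPING = {"dashboard_agentic_query": "default"}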
Side note: given the SIP on MCP (#33870), wondering if some of the use cases could be satisfied through MCP. Sure, it's not the same experience as having a chart analysis fully done right-in-context in the UI with a nice popover embedded straight in the right place, but from my early testing of what … Anyhow, ultimately I think both RAG and MCP approaches are useful and valid, but MCP is so much more "our lane" to support.
Worked with GPT on what a foundational, backend-first extension could look like. Somehow feels like tightly coupling with …

🧠 Superset Extension:
SUMMARY
This PR adds an LLM agentic query search button at the top of the dashboard when the feature flag is set to `True`.

SIP: #32650
I have added a search button at the top of the dashboard, allowing users to ask any query related to the dashboard, charts, anomalies, summaries, etc. When the button is clicked, it will perform an LLM agent query and return the query response in a modal.
By default, the search button will not be visible. To enable it, set the `SHOW_DASHBOARD_AGENT_QUERY` feature flag to `True` and update the OpenAI API key in `superset_config`. If you are using a locally hosted LLM, update the base URL instead. A configuration sketch is shown below.
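For reference, enabling the feature would look roughly like this in `superset_config.py` (the keys mirror the config snippets reviewed above; double-check them against the PR's docker config):

FEATURE_FLAGS = {"SHOW_DASHBOARD_AGENT_QUERY": True}

DASHBOARD_AGENTIC_QUERY_CONFIG = {
    "model": "gpt-4o",
    "api_key": "<your OpenAI API key>",
    "base_url": "",  # set this instead when using a locally hosted LLM
}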
Use cases: users can leave the feature flag set to `False` if they don't want to use the feature.

Screenshots
When the feature flag is `False` (the default): no change, exactly like the original Superset.

When the feature flag is `True`:

CLI logs