Skip to content

[TT-11048] Regex on endpoint list issue #7146

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 27 commits into from

Conversation

shults
Copy link
Contributor

@shults shults commented Jun 25, 2025

User description

TT-11048
Summary Regex on endpoint list issue
Type Bug Bug
Status In Dev
Points N/A
Labels codilime_refined, jira_escalated

Description

Related Issue

Motivation and Context

How This Has Been Tested

Screenshots (if appropriate)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Refactoring or add test (improvements in base code or adds test coverage to functionality)

Checklist

  • I ensured that the documentation is up to date
  • I explained why this PR updates go.mod in detail with reasoning why it's required
  • I would like a code coverage CI quality gate exception and have explained why

PR Type

Enhancement, Tests


Description

  • Introduce robust path normalization for OAS endpoint paths

  • Refactor OAS operation path parsing and normalization logic

  • Add comprehensive tests for path normalization and parser

  • Enhance internal reflection clone test coverage


Changes walkthrough 📝

Relevant files
Enhancement
6 files
oas.go
Add OAS path normalization and validation improvements     
+51/-3   
operation.go
Refactor operation path parsing and mock response extraction
+77/-118
validator.go
Add extended OAS validation options                                           
+17/-0   
option.go
Add Prepend method for options utility                                     
+7/-0     
parser.go
Introduce robust parser for user-defined OAS paths             
+428/-0 
path.go
Implement normalized OAS path representation and normalization logic
+94/-0   
Tests
4 files
parser_test.go
Add comprehensive tests for path parser and normalization
+162/-0 
oas_test.go
Update OAS struct initialization for normalization flag   
+1/-0     
validator_test.go
Update OAS test struct for normalization compatibility     
+1/-0     
clone_test.go
Add deep pointer and map clone tests for reflect utility 
+46/-0   
Additional files
1 files
operation_test.go +1/-1     

Need help?
  • Type /help how to ... in the comments thread for any questions about PR-Agent usage.
  • Check out the documentation for more information.
  • @buger
    Copy link
    Member

    buger commented Jun 25, 2025

    I'm a bot and I 👍 this PR title. 🤖

    @buger
    Copy link
    Member

    buger commented Jun 25, 2025

    This PR is too huge for one to review 💔

    Additions 885 🙅‍♀️
    Expected ⬇️ 800

    Consider breaking it down into multiple small PRs.

    Check out this guide to learn more about PR best-practices.

    Copy link
    Contributor

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 5 🔵🔵🔵🔵🔵
    🧪 PR contains tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Parser Robustness

    The new path parser introduces complex logic for handling various path patterns and regular expressions. The reviewer should carefully validate edge cases, error handling, and ensure that all expected path formats (including malformed or ambiguous ones) are handled gracefully without panics or silent failures.

    package pathnormalizer
    
    import (
    	"errors"
    	"fmt"
    	"regexp"
    	"strings"
    
    	"github.com/TykTechnologies/tyk/common/option"
    	"github.com/getkin/kin-openapi/openapi3"
    )
    
    const (
    	slash                 = '/'
    	curlyBraceLeft        = '{'
    	curlyBraceRight       = '}'
    	muxIdPatternSeparator = ':'
    
    	// RePrefix default regexp prefix.
    	RePrefix            = "customRegex"
    	DefaultNonDefinedRe = "[^/]+"
    )
    
    var (
    	ErrUnreachableCase  = errors.New("unreachable case")
    	ErrUnexpectedSlash  = errors.New("unexpected slash symbol in pattern")
    	ErrUnexpectedSymbol = errors.New("unexpected symbol")
    )
    
    // Parser responsible for parsing user-defined path of given OAS.
    type Parser struct {
    	anonymousReCounter int
    	prefix             string
    	stripSlashes       bool
    	ctrResets          bool
    }
    
    // WithNoStripSlashes skip strip slashes.
    // Removes slashes near each other.
    func WithNoStripSlashes() option.Option[Parser] {
    	return func(parser *Parser) {
    		parser.stripSlashes = false
    	}
    }
    
    // WithPrefix allows to use custom prefix.
    func WithPrefix(prefix string) option.Option[Parser] {
    	return func(parser *Parser) {
    		parser.prefix = prefix
    	}
    }
    
    // WithCtrResets enables counter resets on each parse call.
    // Useful for testing.
    func WithCtrResets() option.Option[Parser] {
    	return func(parser *Parser) {
    		parser.ctrResets = true
    	}
    }
    
    // NewParser instantiates new parser instance
    func NewParser(opts ...option.Option[Parser]) *Parser {
    	return option.New(opts).Build(Parser{
    		anonymousReCounter: 0,
    		prefix:             RePrefix,
    		stripSlashes:       true,
    		ctrResets:          false,
    	})
    }
    
    // Parse responsible for parsing next one path.
    func (p *Parser) Parse(path string) (*NormalizedPath, error) {
    	if p.ctrResets {
    		defer func() {
    			p.anonymousReCounter = 0
    		}()
    	}
    
    	singlePathParser := newPathParser(path, p)
    
    	pPath, err := singlePathParser.parse()
    
    	if err != nil {
    		return nil, err
    	}
    
    	return newNormalizedPath(pPath), nil
    }
    
    type pathParser struct {
    	src    string
    	pos    int
    	parent *Parser
    }
    
    func newPathParser(path string, parent *Parser) *pathParser {
    	return &pathParser{
    		src:    path,
    		pos:    0,
    		parent: parent,
    	}
    }
    
    func (p *pathParser) parse() ([]pathPart, error) {
    	var parts []pathPart
    
    	for p.pos < len(p.src) {
    		ch := p.src[p.pos]
    		var err error
    		var part pathPart
    
    		switch {
    		case ch == slash:
    			if p.parent.stripSlashes {
    				p.consumeAll(slash)
    			} else {
    				p.consumeOne(slash)
    			}
    
    			part = newPathPartSplitter()
    		case ch == curlyBraceLeft:
    			part, err = p.parseMuxRe()
    		default:
    			if id, ok := p.consumeStaticPart(); ok {
    				part = newPathPartRaw(id)
    				break
    			}
    
    			part, err = p.parseAnonymousRe()
    		}
    
    		if err != nil {
    			return nil, err
    		}
    
    		parts = append(parts, part)
    	}
    
    	return parts, nil
    }
    
    func (p *pathParser) consumeMuxIdentifier() (res string, ok bool, finished bool) {
    	if !isLetter(p.src[p.pos]) {
    		return
    	}
    
    	start := p.pos
    	for p.pos < len(p.src) && isMuxIdentifierSymbol(p.src[p.pos]) {
    		p.pos++
    	}
    
    	// end reached
    	if p.pos == len(p.src) {
    		p.pos = start
    		return
    	}
    
    	if p.src[p.pos] == muxIdPatternSeparator {
    		p.pos++
    		return p.src[start : p.pos-1], true, false
    	}
    
    	if p.src[p.pos] == curlyBraceRight {
    		p.pos++
    		return p.src[start : p.pos-1], true, true
    	}
    
    	p.pos = start
    	return
    }
    
    func (p *pathParser) parseMuxRe() (pathPart, error) {
    	if p.src[p.pos] != curlyBraceLeft {
    		return pathPart{}, ErrUnreachableCase
    	}
    
    	// consume first open curly brace
    	p.pos++
    	openBraceCtr := 1
    
    	paramName, idParsedOk, isFinished := p.consumeMuxIdentifier()
    
    	if isFinished {
    		return p.newPathPartRe(paramName, DefaultNonDefinedRe)
    	}
    
    	start := p.pos
    loop:
    	for p.pos < len(p.src) {
    		ch := p.src[p.pos]
    		p.pos++
    
    		switch ch {
    		case slash:
    			p.pos--
    			break loop
    		case curlyBraceLeft:
    			openBraceCtr++
    		case curlyBraceRight:
    			openBraceCtr--
    			if openBraceCtr == 0 {
    				// if next exists it should be /
    				if p.pos < len(p.src) && p.src[p.pos] != slash {
    					return pathPart{}, p.parseError(ErrUnexpectedSymbol)
    				}
    
    				break loop
    			}
    		default:
    		}
    	}
    
    	if openBraceCtr != 0 {
    		p.pos--
    		return pathPart{}, p.parseError(ErrUnexpectedSlash)
    	}
    
    	if !idParsedOk {
    		paramName = p.newAnonymousName()
    	}
    
    	return p.newPathPartRe(paramName, p.src[start:p.pos-1])
    }
    
    func (p *pathParser) parseError(err error) parseError {
    	return parseError{
    		prev: err,
    		pos:  p.pos,
    		src:  p.src,
    	}
    }
    
    func (p *pathParser) parseAnonymousRe() (pathPart, error) {
    	paramName, idParsedOk, isFinished := p.consumeMuxIdentifier()
    
    	if isFinished {
    		return pathPart{}, p.parseError(ErrUnreachableCase)
    	}
    
    	start := p.pos
    	braceCtr := 0
    
    loop:
    	for p.pos < len(p.src) {
    		ch := p.src[p.pos]
    		p.pos++
    
    		switch ch {
    		case slash:
    			p.pos--
    			break loop
    		case curlyBraceLeft:
    			braceCtr++
    		case curlyBraceRight:
    			braceCtr--
    		}
    	}
    
    	if braceCtr != 0 {
    		return pathPart{}, p.parseError(ErrUnexpectedSlash)
    	}
    
    	if !idParsedOk {
    		paramName = p.newAnonymousName()
    	}
    
    	return p.newPathPartRe(
    		paramName,
    		p.src[start:p.pos],
    	)
    }
    
    func (p *pathParser) consumeAll(ch byte) {
    	for p.consumeOne(ch) {
    	}
    }
    
    func (p *pathParser) consumeOne(ch byte) bool {
    	if p.pos < len(p.src) && p.src[p.pos] == ch {
    		p.pos++
    		return true
    	}
    
    	return false
    }
    
    func (p *pathParser) consumeStaticPart() (string, bool) {
    	var b []byte
    	var pos = p.pos
    
    loop:
    	for pos < len(p.src) {
    		ch := p.src[pos]
    		pos++
    
    		switch {
    		case ch == slash:
    			pos--
    			break loop
    		case !isIdentifierSymbol(ch):
    			return "", false
    		default:
    			b = append(b, ch)
    		}
    	}
    
    	p.pos = pos
    	return string(b), len(b) > 0
    }
    
    func (p *pathParser) newAnonymousName() string {
    	p.parent.anonymousReCounter++
    	return fmt.Sprintf("%s%d", p.parent.prefix, p.parent.anonymousReCounter)
    }
    
    func (p *pathParser) newPathPartRe(name, pattern string) (pathPart, error) {
    	if _, err := regexp.Compile(fmt.Sprintf("^%s$", pattern)); err != nil {
    		return pathPart{}, p.parseError(err)
    	}
    
    	return pathPart{
    		name:    name,
    		pattern: pattern,
    		typ:     pathPartRe,
    		parameter: openapi3.
    			NewPathParameter(name).
    			WithSchema(openapi3.NewStringSchema().WithPattern(pattern)),
    	}, nil
    }
    
    type parseError struct {
    	prev error
    	src  string
    	pos  int
    }
    
    func (e parseError) Error() string {
    	baseMsg := fmt.Sprintf(`failed to parse "%s" at %d`, e.src, e.pos)
    
    	if e.prev == nil {
    		return baseMsg
    	}
    
    	return fmt.Sprintf(`%s: %s`, baseMsg, e.prev.Error())
    }
    
    type pathPartType struct {
    	val string
    }
    
    var (
    	pathPartRe       = pathPartType{"reg-exp"}
    	pathPartRaw      = pathPartType{"raw"}
    	pathPartSplitter = pathPartType{"/"}
    )
    
    // pathPart regular expression part.
    type pathPart struct {
    	name      string
    	pattern   string
    	parameter *openapi3.Parameter
    	typ       pathPartType
    }
    
    func newPathPartSplitter() pathPart {
    	return pathPart{
    		typ: pathPartSplitter,
    	}
    }
    
    func newPathPartRaw(name string) pathPart {
    	return pathPart{
    		name: name,
    		typ:  pathPartRaw,
    	}
    }
    
    func (r *pathPart) normalize() string {
    	switch r.typ {
    	case pathPartSplitter:
    		return "/"
    	case pathPartRaw:
    		return r.name
    	case pathPartRe:
    		var sb strings.Builder
    		sb.Grow(len(r.name) + 2)
    		sb.WriteRune(curlyBraceLeft)
    		sb.WriteString(r.name)
    		sb.WriteRune(curlyBraceRight)
    		return sb.String()
    	default:
    		panic("invalid path part type")
    	}
    }
    
    func isIdentifierSymbol(s byte) bool {
    	return isLowerLetter(s) || isUpperLetter(s) || isDigit(s) || isOneOf(s, "._-")
    }
    
    func isMuxIdentifierSymbol(s byte) bool {
    	return isLowerLetter(s) || isUpperLetter(s) || isDigit(s) || isOneOf(s, "_-")
    }
    
    func isLetter(s byte) bool {
    	return isLowerLetter(s) || isUpperLetter(s)
    }
    
    func isLowerLetter(s byte) bool {
    	return s >= 'a' && s <= 'z'
    }
    
    func isUpperLetter(s byte) bool {
    	return s >= 'A' && s <= 'Z'
    }
    
    func isDigit(s byte) bool {
    	return s >= '0' && s <= '0'
    }
    
    func isOneOf(s byte, str string) bool {
    	for _, r := range []byte(str) {
    		if r == s {
    			return true
    		}
    	}
    
    	return false
    }
    Validation and Normalization Changes

    The OAS validation logic now normalizes paths before validation and logs detailed errors. The reviewer should ensure that this normalization does not introduce regressions or unintended side effects, especially for APIs with complex or legacy path definitions.

    	normalized, err := s.newNormalized()
    
    	if err != nil {
    		return err
    	}
    
    	if err = normalized.T.Validate(ctx, opts...); err != nil {
    		origin, err2 := json.Marshal(s.T)
    		modified, err3 := json.Marshal(normalized.T)
    
    		log.
    			WithError(errors.Join(err, err2, err3)).
    			WithField("origin", string(origin)).
    			WithField("oas", string(modified)).
    			Errorf("failed to validate OAS")
    
    		return err
    	}
    
    	// todo: replace as it was (join errors)
    	return normalized.validateSecurity()
    }
    Operation ID Generation Refactor

    The logic for generating operation IDs and handling path parameters has been refactored. The reviewer should verify that operation IDs remain unique and consistent, and that path parameter extraction works as expected for all supported path formats.

    func (s *OAS) getOrCreatePathItem(item string) *openapi3.PathItem {
    	if s.Paths.Value(item) == nil {
    		s.Paths.Set(item, &openapi3.PathItem{})
    	}
    
    	return s.Paths.Value(item)
    }
    
    func createOrUpdateOperation(
    	pathItem *openapi3.PathItem,
    	method, operationID string,
    ) *openapi3.Operation {
    
    	operation := pathItem.GetOperation(method)
    
    	if operation == nil {
    		operation = &openapi3.Operation{
    			Responses: openapi3.NewResponses(),
    		}
    
    		pathItem.SetOperation(method, operation)
    	}
    
    	if operation.OperationID == "" {
    		operation.OperationID = operationID
    	}
    
    	return operation
    }
    
    // todo: OperationId => ValueObject
    func makeOperationId(inPath, method string) string {
    	return strings.TrimPrefix(inPath, "/") + strings.ToUpper(method)
    }
    
    // getOperationID side effect method.
    // Deprecated cause of side effects.
    func (s *OAS) getOperationID(inPath, method string) string {
    	return createOrUpdateOperation(s.getOrCreatePathItem(inPath), method, makeOperationId(inPath, method)).OperationID
    }

    Copy link
    Contributor

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Impact
    Possible issue
    Fix digit detection logic

    The isDigit function incorrectly checks if the byte is exactly '0', rather than any
    digit. This will cause digit detection to fail for all digits except '0'. Update the
    range to cover all ASCII digits.

    internal/pathnormalizer/parser.go [416-418]

     func isDigit(s byte) bool {
    -	return s >= '0' && s <= '0'
    +	return s >= '0' && s <= '9'
     }
    Suggestion importance[1-10]: 9

    __

    Why: The original isDigit implementation only returns true for '0', which is incorrect and would break any logic relying on digit detection; correcting this to cover '0'-'9' is critical for proper path parsing and normalization.

    High

    Copy link
    Contributor

    github-actions bot commented Jun 25, 2025

    API Changes

    --- prev.txt	2025-07-10 07:42:13.329705936 +0000
    +++ current.txt	2025-07-10 07:42:03.922818810 +0000
    @@ -3556,6 +3556,7 @@
     
     type OAS struct {
     	openapi3.T
    +	// Has unexported fields.
     }
         OAS holds the upstream OAS definition as well as adds functionality like
         custom JSON marshalling.
    @@ -3574,10 +3575,10 @@
     func (s *OAS) Clone() (*OAS, error)
         Clone creates a deep copy of the OAS object and returns a new instance.
     
    -func (s *OAS) ExtractTo(api *apidef.APIDefinition)
    +func (s *OAS) ExtractTo(api *apidef.APIDefinition) error
         ExtractTo extracts *OAS into *apidef.APIDefinition.
     
    -func (s *OAS) Fill(api apidef.APIDefinition)
    +func (s *OAS) Fill(api apidef.APIDefinition) error
         Fill fills *OAS definition from apidef.APIDefinition.
     
     func (s *OAS) GetTykExtension() *XTykAPIGateway
    @@ -3591,6 +3592,10 @@
     func (s *OAS) MarshalJSON() ([]byte, error)
         MarshalJSON implements json.Marshaller.
     
    +func (s *OAS) Normalize() error
    +    Normalize normalizes path to acceptable form by openapi. If API has named
    +    RegExp it will be converted to accepted form of api by OAS.
    +
     func (s *OAS) RemoveTykExtension()
         RemoveTykExtension clears the Tyk extensions from *OAS.
     
    @@ -5508,9 +5513,14 @@
     type Options[O any] []Option[O]
     
     func New[O any](opts []Option[O]) Options[O]
    +    New options instance.
     
     func (o Options[O]) Build(baseVal O) *O
     
    +func (o Options[O]) Prepend(opts ...Option[O]) Options[O]
    +    Prepend adds options at the beginning of the list. Can be used for adding
    +    default options.
    +
     # Package: ./config
     
     package config // import "github.com/TykTechnologies/tyk/config"
    @@ -10472,12 +10482,6 @@
     
     func (k *OpenIDMW) ProcessRequest(w http.ResponseWriter, r *http.Request, _ interface{}) (error, int)
     
    -type Operation struct {
    -	*oas.Operation
    -
    -	// Has unexported fields.
    -}
    -
     type OrganizationMonitor struct {
     	*BaseMiddleware
     
    @@ -11719,7 +11723,7 @@
     
     func (k *ValidateRequest) Name() string
     
    -func (k *ValidateRequest) ProcessRequest(w http.ResponseWriter, r *http.Request, _ interface{}) (error, int)
    +func (k *ValidateRequest) ProcessRequest(_ http.ResponseWriter, r *http.Request, _ interface{}) (error, int)
         ProcessRequest will run any checks on the request on the way through the
         system, return an error to have the chain fail
     

    @shults shults requested review from andrei-tyk and MaciekMis and removed request for andrei-tyk June 30, 2025 10:03
    @shults shults requested a review from andrei-tyk July 4, 2025 13:10
    @shults
    Copy link
    Contributor Author

    shults commented Jul 4, 2025

    /probe review

    Copy link
    Contributor

    github-actions bot commented Jul 4, 2025

    Code Review: Regex on Endpoint List Issue (PR #7146)

    Summary

    This PR addresses a critical bug in the path normalization functionality of Tyk Gateway, specifically in the handling of regex patterns in endpoint paths. The core issue is in the isDigit function which incorrectly recognizes only the digit '0' and fails to recognize digits 1-9.

    Critical Issues

    Bug in Digit Detection

    The isDigit function in internal/pathnormalizer/parser.go has a critical bug:

    func isDigit(s byte) bool {
        return s >= '0' && s <= '0'  // Bug: Only recognizes '0' as a digit
    }

    This function is supposed to check if a byte is a digit (0-9), but it's incorrectly checking if the byte is exactly '0' (since the range is from '0' to '0'). This means that only the character '0' will be recognized as a digit, and all other digits ('1' through '9') will not be recognized.

    The PR correctly fixes this by changing the upper bound from '0' to '9':

    func isDigit(s byte) bool {
        return s >= '0' && s <= '9'  // Fixed: Recognizes all digits 0-9
    }

    Impact of the Bug

    This bug has significant implications:

    1. Paths containing digits other than '0' would not be properly parsed
    2. Regex patterns in endpoint paths that include digits would fail
    3. Path normalization would produce incorrect results for paths with digits 1-9
    4. API definitions with regex patterns containing digits would not work as expected

    For example, a path like /users/123 or a regex pattern like id:[0-9]+ would not be properly parsed because '1', '2', '3', etc. wouldn't be recognized as digits.

    PR Improvements

    Beyond fixing the core bug, this PR introduces several improvements:

    1. A robust path normalization system for OAS endpoint paths
    2. A new parser for user-defined OAS paths that handles various formats
    3. Proper conversion between Tyk's extended path format and OpenAPI's path format
    4. Comprehensive test coverage for path normalization and parsing

    The path normalization is particularly important because it allows Tyk to support both its own extended path format (which includes regex patterns) and the standard OpenAPI path format.

    Conclusion

    This PR addresses a critical bug that would affect any API definition using regex patterns with digits in endpoint paths. The fix is minimal and focused on the specific issue, while also introducing additional improvements to make the path normalization system more robust. The comprehensive test coverage ensures that the fix works correctly for various path formats and edge cases.


    Tip: Mention me again using /probe <request>.
    Powered by Probe AI

    Copy link
    Contributor

    Analysis of PR #7146: Regex on Endpoint List Issue

    Summary

    This PR addresses a critical bug in the path normalization functionality of Tyk Gateway, specifically in the handling of regex patterns in endpoint paths. The core issue is in the isDigit function which incorrectly recognizes only the digit '0' and fails to recognize digits 1-9.

    The Bug

    The root cause of the issue is in the isDigit function in internal/pathnormalizer/parser.go:

    func isDigit(s byte) bool {
        return s >= '0' && s <= '0'  // Bug: Only recognizes '0' as a digit
    }

    This function is supposed to check if a byte is a digit (0-9), but it's incorrectly checking if the byte is exactly '0' (since the range is from '0' to '0'). This means that only the character '0' will be recognized as a digit, and all other digits ('1' through '9') will not be recognized.

    The fix is straightforward:

    func isDigit(s byte) bool {
        return s >= '0' && s <= '9'  // Fixed: Recognizes all digits 0-9
    }

    Impact of the Bug

    This bug has significant implications:

    1. Paths containing digits other than '0' would not be properly parsed
    2. Regex patterns in endpoint paths that include digits would fail
    3. Path normalization would produce incorrect results for paths with digits 1-9
    4. API definitions with regex patterns containing digits would not work as expected

    For example, a path like /users/123 or a regex pattern like id:[0-9]+ would not be properly parsed because '1', '2', '3', etc. wouldn't be recognized as digits.

    Key Changes

    1. Robust Path Normalization

    The PR introduces a comprehensive path normalization system:

    • A new Parser for handling various path formats
    • A NormalizedPath representation for standardized paths
    • A Mapper to handle the mapping between different path formats
    • Proper handling of path parameters and regex patterns

    2. Refactored Operation Path Parsing

    • Improved logic for parsing operation paths
    • Better handling of path parameters
    • More robust error handling
    • Replaced the old regex path detection with a more robust parser

    3. Enhanced Validation

    • Added extended OAS validation options
    • Improved error reporting for validation failures
    • Normalized paths before validation
    • Better handling of path collisions

    4. Comprehensive Test Coverage

    • Added extensive tests for path normalization and parsing
    • Tests for various path formats and edge cases
    • Tests for regex patterns with digits
    • Tests for path parameter extraction

    Code Quality Improvements

    1. Error Handling: The PR introduces better error handling with detailed error messages.
    2. Modularity: The code is more modular with clear separation of concerns.
    3. Type Safety: Improved type safety with generics and better parameter handling.
    4. Documentation: Added comprehensive documentation for new functions and types.
    5. Testing: Extensive test coverage for the new functionality.

    Potential Issues

    There are a couple of minor issues that should be addressed:

    1. In operation.go, there appears to be a typo in the filler function where methodField, ok := typ.FieldByName("Path") should be methodField, ok := typ.FieldByName("Method").
    2. In fillPathsAndOperations, the error collector's result is not being checked with collector.Err() at the end of the function.

    Conclusion

    This PR fixes a critical bug that would affect any API definition using regex patterns with digits in endpoint paths. The fix is minimal and focused on the specific issue, while also introducing additional improvements to make the path normalization system more robust. The comprehensive test coverage ensures that the fix works correctly for various path formats and edge cases.

    The PR is well-structured and includes all necessary changes to fix the issue and improve the overall code quality. The two minor issues mentioned should be addressed before merging, but they don't affect the core functionality of the fix.


    Tip: Mention me again using /dependency <request>.
    Powered by Probe AI

    Copy link
    Contributor

    Analysis of PR #7146: Regex on Endpoint List Issue

    This PR addresses a critical bug in the path normalization functionality of Tyk Gateway, specifically in the handling of regex patterns in endpoint paths.

    The Core Issue

    The bug is in the isDigit function in internal/pathnormalizer/parser.go:

    func isDigit(s byte) bool {
        return s >= '0' && s <= '0'  // Bug: Only recognizes '0' as a digit
    }

    This function is supposed to check if a byte is a digit (0-9), but it's incorrectly checking if the byte is exactly '0' (since the range is from '0' to '0'). This means that only the character '0' would be recognized as a digit, and all other digits ('1' through '9') would not be recognized.

    The PR correctly fixes this by changing the upper bound from '0' to '9':

    func isDigit(s byte) bool {
        return s >= '0' && s <= '9'  // Fixed: Recognizes all digits 0-9
    }

    Impact of the Bug

    This bug has significant implications:

    1. Paths containing digits other than '0' would not be properly parsed
    2. Regex patterns in endpoint paths that include digits would fail
    3. Path normalization would produce incorrect results for paths with digits 1-9
    4. API definitions with regex patterns containing digits would not work as expected

    For example, a path like /users/123 or a regex pattern like id:[0-9]+ would not be properly parsed because '1', '2', '3', etc. wouldn't be recognized as digits.

    PR Improvements

    Beyond fixing the core bug, this PR introduces several improvements:

    1. A robust path normalization system for OAS endpoint paths
    2. A new parser for user-defined OAS paths that handles various formats
    3. Proper conversion between Tyk's extended path format and OpenAPI's path format
    4. Comprehensive test coverage for path normalization and parsing

    The path normalization is particularly important because it allows Tyk to support both its own extended path format (which includes regex patterns) and the standard OpenAPI path format.

    Security Implications

    The bug could potentially have security implications as it might cause certain path-based security rules to be bypassed if they rely on regex patterns containing digits other than '0'. The fix ensures that all path patterns are correctly parsed and normalized, maintaining the intended security constraints.


    Tip: Mention me again using /security <request>.
    Powered by Probe AI

    Copy link
    Contributor

    Analysis of PR #7146: Regex on Endpoint List Issue

    This PR addresses a critical bug in the path normalization functionality of Tyk Gateway, specifically in the handling of regex patterns in endpoint paths.

    The Bug

    The core issue is in the isDigit function in internal/pathnormalizer/parser.go:

    func isDigit(s byte) bool {
        return s >= '0' && s <= '0'  // Bug: Only recognizes '0' as a digit
    }

    This function is supposed to check if a byte is a digit (0-9), but it's incorrectly checking if the byte is exactly '0' (since the range is from '0' to '0'). This means that only the character '0' will be recognized as a digit, and all other digits ('1' through '9') will not be recognized.

    The PR fixes this by changing the upper bound from '0' to '9':

    func isDigit(s byte) bool {
        return s >= '0' && s <= '9'  // Fixed: Recognizes all digits 0-9
    }

    Impact of the Bug

    This bug has significant implications:

    1. Path Parsing Failures: Paths containing digits 1-9 would not be properly parsed because these digits wouldn't be recognized as valid identifier symbols.

    2. Regex Pattern Issues: Regex patterns in endpoint paths that include digits (like /users/{id:[0-9]+}) would fail to be properly normalized.

    3. Path Normalization Errors: The path normalization process, which converts Tyk's extended path format to the OpenAPI path format, would produce incorrect results for paths with digits 1-9.

    4. API Definition Problems: API definitions with regex patterns containing digits would not work as expected, potentially causing routing issues.

    PR Improvements

    Beyond fixing the core bug, this PR introduces several improvements:

    1. Robust Path Normalization: A complete path normalization system for OAS endpoint paths that handles various formats and edge cases.

    2. New Parser: A dedicated parser for user-defined OAS paths that properly handles regex patterns, path parameters, and other complex path structures.

    3. Mapping System: A new Mapper struct that helps map between original paths and their normalized versions.

    4. Comprehensive Tests: Extensive test coverage for path normalization and parsing, including tests for various path formats and edge cases, such as:

      • Simple paths: /hello/world
      • Paths with regex patterns: /test/.*/end
      • Paths with named parameters: /users/{id}/profile
      • Paths with named regex patterns: /users/{userId:[0-9]+}/posts
      • Paths with colon-separated patterns: /user/id:[0-9]+/accelerate
      • Complex paths with multiple patterns: /users/{userId:[0-9]+}/posts/{postId:[a-z]+}/[a-z]+/{[0-9]{2}}/[a-z]{10}/abc/{id}/def/[0-9]+
      • gRPC-style paths: /v1.Service/stats.Service

    The test suite also includes negative test cases to ensure proper error handling for malformed paths.

    Implementation Details

    The PR introduces a new path normalization package (internal/pathnormalizer) with several key components:

    1. Parser: Responsible for parsing user-defined paths and converting them to a normalized form.
    2. NormalizedPath: Represents a normalized path with its parameters.
    3. Mapper: Maps between original paths and their normalized versions.

    The parser is particularly sophisticated, handling various path formats including:

    • Static path segments
    • Path parameters with and without regex patterns
    • Anonymous regex patterns
    • Multiple regex patterns in a single path

    Conclusion

    This PR fixes a critical bug that would affect any API definition using regex patterns with digits in endpoint paths. The fix is minimal and focused on the specific issue, while also introducing additional improvements to make the path normalization system more robust. The comprehensive test coverage ensures that the fix works correctly for various path formats and edge cases.

    The path normalization functionality is crucial for Tyk Gateway as it allows proper handling of API endpoints with complex path patterns, ensuring correct routing and parameter extraction.


    Tip: Mention me again using /performance <request>.
    Powered by Probe AI

    Copy link
    Contributor

    Analysis of PR #7146: Regex on Endpoint List Issue

    The Bug

    This PR addresses a critical bug in the path normalization functionality of Tyk Gateway, specifically in the handling of regex patterns in endpoint paths. The core issue is in the isDigit function in internal/pathnormalizer/parser.go:

    func isDigit(s byte) bool {
        return s >= '0' && s <= '0'  // Bug: Only recognizes '0' as a digit
    }

    This function is supposed to check if a byte is a digit (0-9), but it's incorrectly checking if the byte is exactly '0' (since the range is from '0' to '0'). This means that only the character '0' will be recognized as a digit, and all other digits ('1' through '9') will not be recognized.

    The Fix

    The PR fixes this by correcting the isDigit function to properly check for all digits:

    func isDigit(s byte) bool {
        return s >= '0' && s <= '9'  // Fixed: Recognizes all digits 0-9
    }

    This simple one-character change ensures that all digits from 0 to 9 are properly recognized during path parsing.

    Impact of the Bug

    This bug has significant implications for path normalization and regex pattern handling:

    1. Path Parsing Failures: Paths containing digits other than '0' would not be properly parsed
    2. Regex Pattern Issues: Regex patterns in endpoint paths that include digits would fail
    3. Incorrect Normalization: Path normalization would produce incorrect results for paths with digits 1-9
    4. API Definition Problems: API definitions with regex patterns containing digits would not work as expected

    For example, a path like /users/123 or a regex pattern like id:[0-9]+ would not be properly parsed because '1', '2', '3', etc. wouldn't be recognized as digits.

    Additional Improvements

    Beyond fixing the core bug, this PR introduces several improvements to the path normalization system:

    1. Robust Path Normalization: A comprehensive system for OAS endpoint paths that handles various formats
    2. New Path Parser: A parser for user-defined OAS paths that correctly handles complex patterns
    3. Format Conversion: Proper conversion between Tyk's extended path format and OpenAPI's path format
    4. Test Coverage: Comprehensive tests for path normalization and parsing

    The PR also addresses two other issues identified during code review:

    1. A typo in the filler function where it was looking for a field named "Path" twice instead of "Method"
    2. Missing error handling in the fillPathsAndOperations function where the error collector's results weren't being checked

    Test Coverage

    The PR includes extensive test cases for path normalization, including paths with:

    • Simple static segments
    • Path parameters with and without regex patterns
    • Multiple parameters and regex patterns
    • Various combinations of static segments, parameters, and regex patterns

    These tests ensure that the path normalization works correctly for all supported path formats and edge cases.

    Conclusion

    This PR addresses a critical bug that would affect any API definition using regex patterns with digits in endpoint paths. The fix is straightforward - changing a single character in the isDigit function - but the impact is significant for API definitions that use regex patterns with digits.

    The comprehensive test coverage added in this PR ensures that the path normalization works correctly for various path formats and edge cases, making the API Gateway more robust in handling different types of endpoint paths.


    Tip: Mention me again using /connectivity <request>.
    Powered by Probe AI

    Copy link

    @shults shults marked this pull request as draft July 10, 2025 10:39
    @shults shults closed this Jul 14, 2025
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    3 participants