Contributing to RegressionLab

Thank you for your interest in contributing to RegressionLab! This guide will help you get started with contributing code, documentation, or other improvements to the project.

Ways to Contribute

There are many ways to contribute to RegressionLab:

  • 🐛 Report bugs: Help us identify issues.

  • 💡 Suggest features: Share your ideas for improvements.

  • 📝 Improve documentation: Fix typos, add examples, clarify explanations.

  • 🔧 Fix bugs: Submit patches for known issues.

  • ✨ Add features: Implement new functionality.

  • 🧪 Write tests: Improve code coverage.

  • 🌍 Add translations: Support more languages.

  • 📊 Add equations: Contribute new fitting functions.

  • 🎨 Improve UI/UX: Enhance user experience.

Getting Started

1. Fork and Clone

# Fork the repository on GitHub (click "Fork" button)

# Clone your fork
git clone https://github.com/<your-username>/RegressionLab.git
cd RegressionLab

# Add upstream remote
git remote add upstream https://github.com/DOKOS-TAYOS/RegressionLab.git

2. Set Up Development Environment

# Create virtual environment
python -m venv .venv

# Activate virtual environment
source .venv/bin/activate  # macOS/Linux
.venv\Scripts\activate     # Windows

# Install dependencies including development tools (pytest, ruff, black, mypy, pre-commit)
pip install -r requirements-dev.txt
# Or install from pyproject.toml with dev extras:
# pip install -e ".[dev]"

# Install the package in editable mode (if not already done by the line above)
pip install -e .

3. Create a Branch

# Update main branch
git checkout main
git pull upstream main

# Create feature branch
git checkout -b feature/your-feature-name
# or for bug fixes:
git checkout -b fix/issue-description

Branch Naming Conventions:

  • feature/ - New features.

  • fix/ - Bug fixes.

  • docs/ - Documentation updates.

  • refactor/ - Code refactoring.

  • test/ - Adding or updating tests.
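
As a rough illustration, the prefixes above can be checked with a small helper. The function name `is_valid_branch_name` and the slug rule (lowercase letters, digits, and hyphens after the slash) are assumptions made for this sketch; they are not part of the project's tooling.

```python
import re

# Prefixes taken from the branch naming conventions listed above.
ALLOWED_PREFIXES = ("feature", "fix", "docs", "refactor", "test")

# Assumption: the part after the slash is a lowercase, hyphen-separated slug.
_BRANCH_PATTERN = re.compile(
    r"^(?:" + "|".join(ALLOWED_PREFIXES) + r")/[a-z0-9]+(?:-[a-z0-9]+)*$"
)


def is_valid_branch_name(name: str) -> bool:
    """Return True if the name is a known prefix followed by a slug."""
    return _BRANCH_PATTERN.match(name) is not None


if __name__ == "__main__":
    for candidate in ("feature/logistic-fit", "fix/csv-encoding", "my-branch"):
        print(candidate, is_valid_branch_name(candidate))
```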

Development Guidelines

Code Style

RegressionLab follows PEP 8 with some project-specific conventions:

1. Line Length

  • Maximum 100 characters per line.

  • For long strings, use implicit concatenation or textwrap.

# Good
message = (
    "This is a very long message that spans "
    "multiple lines for better readability."
)

# Bad
message = "This is a very long message that spans multiple lines for better readability."
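
The guideline also mentions textwrap. The following sketch shows `textwrap.dedent` and `textwrap.fill` next to implicit concatenation; the message texts are invented for illustration only.

```python
import textwrap

# Implicit concatenation: adjacent string literals are joined into one string.
short_message = (
    "Could not fit the selected equation: "
    "check that the data columns contain numeric values."
)

# textwrap.dedent removes the common leading whitespace, so a multi-line
# message can be indented together with the surrounding code.
help_text = textwrap.dedent(
    """\
    The input file must contain one column per variable.
    Uncertainty columns are optional.
    """
)

# textwrap.fill re-wraps a long string to a maximum line width.
wrapped = textwrap.fill(short_message, width=60)

if __name__ == "__main__":
    print(help_text)
    print(wrapped)
```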

2. Type Hints

Always include type hints for function signatures:

from typing import Optional, Tuple
from numpy.typing import NDArray
import numpy as np

def process_data(
    data: NDArray[np.floating],
    threshold: float = 0.5,
    normalize: bool = True
) -> Tuple[NDArray[np.floating], float]:
    """Process data with optional normalization."""
    ...

3. Docstrings

Use Google-style docstrings:

from typing import Tuple

import numpy as np
import pandas as pd


def fit_curve(data: pd.DataFrame, equation: str) -> Tuple[np.ndarray, float]:
    """
    Fit a curve to the provided data.
    
    This function performs nonlinear least squares fitting using
    scipy.optimize.curve_fit with automatic initial parameter estimation.
    
    Args:
        data: DataFrame containing x, y, and optional uncertainty columns
        equation: Name of the equation to fit (e.g., 'linear_function')
        
    Returns:
        Tuple containing:
            - Fitted parameters as ndarray
            - R-squared value as float
            
    Raises:
        FittingError: If the fitting algorithm fails to converge
        ValueError: If equation name is not recognized
        
    Examples:
        >>> data = pd.DataFrame({'x': [1, 2, 3], 'y': [2, 4, 6]})
        >>> params, r2 = fit_curve(data, 'linear_function')
        >>> params
        array([2.])
        >>> r2
        1.0
    """
    ...
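
Because the Examples section uses interpreter-style `>>>` lines, it can be executed with the standard library's doctest module. The sketch below is one possible way to do that; `scale_values` and `count_doctest_failures` are hypothetical names used only for this illustration, and the project may not run doctests at all.

```python
import doctest
from typing import Callable, List


def scale_values(values: List[float], factor: float) -> List[float]:
    """
    Multiply every value by a constant factor.

    Args:
        values: Numbers to scale.
        factor: Multiplier applied to each value.

    Returns:
        A new list with the scaled values.

    Examples:
        >>> scale_values([1.0, 2.0, 3.0], 2.0)
        [2.0, 4.0, 6.0]
    """
    return [value * factor for value in values]


def count_doctest_failures(func: Callable) -> int:
    """Run the doctest examples in func's docstring and return the failure count."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # Provide the function itself as the only global the examples need.
    tests = finder.find(func, name=func.__name__, globs={func.__name__: func})
    for test in tests:
        runner.run(test)
    return runner.summarize(verbose=False).failed


if __name__ == "__main__":
    print("doctest failures:", count_doctest_failures(scale_values))
```

Keeping docstring examples executable helps catch the case where the documented output drifts away from what the function actually returns.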

4. Imports

Organize imports in three groups, separated by blank lines:

# Standard library
import os
import sys
from pathlib import Path
from typing import Optional, List

# Third-party packages
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit

# Local imports
from config import AVAILABLE_EQUATION_TYPES
from fitting.fitting_utils import generic_fit
from utils.exceptions import FittingError
from utils.logger import get_logger

5. Naming Conventions

  • Functions/variables: snake_case

  • Classes: PascalCase

  • Constants: UPPER_SNAKE_CASE

  • Private members: _leading_underscore

# Good
MAX_ITERATIONS = 1000

class DataLoader:
    def __init__(self):
        self._cache = {}
    
    def load_file(self, file_path: str) -> pd.DataFrame:
        ...
    
    def _parse_header(self, line: str) -> List[str]:
        ...

6. Comments

  • Use comments sparingly - prefer self-documenting code

  • Comments should explain why, not what

  • Keep comments up-to-date when code changes

# Good - explains why
# Use absolute_sigma=True to treat uncertainties as absolute values,
# not relative weights, which is correct for experimental data
popt, pcov = curve_fit(func, x, y, sigma=uy, absolute_sigma=True)

# Bad - states the obvious
# Call curve_fit function
popt, pcov = curve_fit(func, x, y)

Testing

Writing Tests

  1. Location: Place tests in the tests/ directory

  2. Naming: Test files should match source files: test_<module>.py

  3. Structure: Use pytest conventions

# tests/test_fitting_functions.py
import numpy as np
import pandas as pd
import pytest
from fitting.fitting_functions import func_lineal, ajlineal


class TestFuncLineal:
    """Tests for func_lineal mathematical function."""
    
    def test_scalar_input(self):
        """Test with scalar input."""
        result = func_lineal(5.0, 2.0)
        assert result == 10.0
    
    def test_array_input(self):
        """Test with array input."""
        t = np.array([1, 2, 3])
        result = func_lineal(t, 2.0)
        expected = np.array([2, 4, 6])
        np.testing.assert_array_equal(result, expected)
    
    @pytest.mark.parametrize("t,m,expected", [
        (0, 5, 0),
        (3, 2, 6),
        (-2, 4, -8),
    ])
    def test_various_inputs(self, t, m, expected):
        """Test with various parameter combinations."""
        assert func_lineal(t, m) == expected


class TestAjlineal:
    """Tests for ajlineal fitting function."""
    
    def test_perfect_linear_fit(self):
        """Test fitting with perfect linear data."""
        x = np.linspace(0, 10, 50)
        y = 3.0 * x  # Perfect linear relationship
        
        data = pd.DataFrame({'x': x, 'y': y})
        
        param_text, y_fitted, equation, r_squared = ajlineal(data, 'x', 'y')
        
        # R² should be nearly 1 for perfect fit
        assert r_squared > 0.9999
        
        # Fitted values should match original data
        np.testing.assert_array_almost_equal(y_fitted, y, decimal=10)
    
    def test_noisy_data(self):
        """Test fitting with noisy data."""
        np.random.seed(42)  # Reproducibility
        x = np.linspace(0, 10, 100)
        y = 2.5 * x + np.random.normal(0, 0.5, 100)
        
        data = pd.DataFrame({'x': x, 'y': y})
        
        param_text, y_fitted, equation, r_squared = ajlineal(data, 'x', 'y')
        
        # Should still get good fit despite noise
        assert r_squared > 0.95
    
    def test_with_uncertainties(self):
        """Test fitting with uncertainty columns."""
        x = np.linspace(0, 10, 50)
        y = 3.0 * x
        
        data = pd.DataFrame({
            'x': x,
            'y': y,
            'ux': np.ones_like(x) * 0.1,
            'uy': np.ones_like(y) * 0.2
        })
        
        param_text, y_fitted, equation, r_squared = ajlineal(data, 'x', 'y')
        
        assert r_squared > 0.99
    
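    def test_raises_on_invalid_data(self):
        """Test that appropriate errors are raised for invalid data."""
        # Build the invalid input outside the context manager so that only
        # the call under test can satisfy pytest.raises
        data = pd.DataFrame({'x': [1, 2, 3]})  # Missing 'y' column
        with pytest.raises(KeyError):
            ajlineal(data, 'x', 'y')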

Running Tests

# Run all tests
pytest tests/
# Or: python tests/run_tests.py
# Or use launcher: bin\run_tests.bat (Windows) / bin/run_tests.sh (Linux/macOS)

# Run specific test file
pytest tests/test_fitting_functions.py

# Run specific test
pytest tests/test_fitting_functions.py::TestFuncLineal::test_scalar_input

# Run with coverage
pytest tests/ --cov=src --cov-report=html

# Run with verbose output
pytest tests/ -v

# Run tests in parallel (requires pytest-xdist)
pytest tests/ -n auto

Check coverage:

pytest tests/ --cov=src --cov-report=term-missing

Adding New Fitting Functions

See Extending RegressionLab for a detailed guide.

Summary:

  1. Add mathematical and fitting functions in src/fitting/functions/ (e.g. special.py, polynomials.py)

  2. Export the fit function from src/fitting/functions/__init__.py, then register in src/config/equations.yaml (add entry with function, formula, format, param_names)

  3. Add translations to src/locales/ (en.json, es.json, de.json)

  4. Write tests in tests/test_fitting_functions.py

  5. Update documentation
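
The first two steps might look roughly like the sketch below. It is not taken from the codebase: `func_logistic`, `fit_logistic`, and every value in `LOGISTIC_ENTRY` are hypothetical, and the real function signature and YAML schema should be copied from an existing equation. Only the key names (function, formula, format, param_names) come from the summary above.

```python
import math


# Hypothetical mathematical function. The project's own functions may take
# NumPy arrays and use a different signature; check an existing module first.
def func_logistic(t: float, capacity: float, rate: float, midpoint: float) -> float:
    """Logistic growth curve: capacity / (1 + exp(-rate * (t - midpoint)))."""
    return capacity / (1.0 + math.exp(-rate * (t - midpoint)))


# Assumed shape of an equations.yaml entry, written here as a Python dict.
# All values are illustrative placeholders.
LOGISTIC_ENTRY = {
    "function": "fit_logistic",  # hypothetical name of the exported fit function
    "formula": "y = L / (1 + exp(-k (t - t0)))",
    "format": "y = {L} / (1 + exp(-{k} (t - {t0})))",
    "param_names": ["L", "k", "t0"],
}

if __name__ == "__main__":
    # At the midpoint the curve reaches half of the carrying capacity.
    print(func_logistic(0.0, capacity=2.0, rate=1.0, midpoint=0.0))
```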

Adding Translations

To add a new language:

  1. Create locale file: src/locales/<language_code>.json

{
  "menu": {
    "normal_fitting": "Your translation",
    "multiple_datasets": "Your translation"
  },
  "dialog": {
    "select_file": "Your translation"
  }
}

  2. Update config: Add the language code to SUPPORTED_LANGUAGE_CODES and LANGUAGE_ALIASES in src/config/constants.py

  3. Test thoroughly: Check all UI elements in both interfaces

  4. Update documentation: Add language to README and docs
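
When adding a locale, it is easy to miss a key. A small script along the following lines could compare a new file against en.json; the helper names are made up for this sketch, and it assumes locale files are nested JSON objects like the example above.

```python
import json
from pathlib import Path
from typing import Any, Dict, Set


def flatten_keys(tree: Dict[str, Any], prefix: str = "") -> Set[str]:
    """Collect dotted key paths such as 'menu.normal_fitting'."""
    keys: Set[str] = set()
    for name, value in tree.items():
        path = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict):
            keys |= flatten_keys(value, path)
        else:
            keys.add(path)
    return keys


def missing_keys(reference: Dict[str, Any], candidate: Dict[str, Any]) -> Set[str]:
    """Return key paths present in the reference locale but not in the candidate."""
    return flatten_keys(reference) - flatten_keys(candidate)


def load_locale(path: Path) -> Dict[str, Any]:
    """Read a locale JSON file as UTF-8."""
    return json.loads(path.read_text(encoding="utf-8"))
```

For example, `missing_keys(load_locale(Path("src/locales/en.json")), load_locale(Path("src/locales/fr.json")))` would list the entries still untranslated in a hypothetical fr.json.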

Documentation

Updating Documentation

  • Documentation is in the docs/ directory

  • Use Markdown format

  • Follow existing structure and style

  • Include code examples where appropriate

  • Add screenshots for UI changes (place in docs/images/)

Building Documentation

# Install documentation dependencies
pip install -r sphinx-docs/requirements.txt

# Build HTML documentation
cd sphinx-docs
./build_docs.sh  # Linux/macOS
build_docs.bat   # Windows

# View documentation
./open_docs.sh   # Linux/macOS
open_docs.bat    # Windows

Submitting Changes

1. Commit Your Changes

Write clear, descriptive commit messages:

# Good commit messages
git commit -m "feat: Add exponential decay fitting function"
git commit -m "fix: Fix unicode encoding issue in CSV loader"
git commit -m "fix: Improve error message for missing uncertainty columns"
git commit -m "docs: Update installation instructions for Windows"

# Bad commit messages (too vague)
git commit -m "Fixed stuff"
git commit -m "Update"
git commit -m "WIP"

Commit Message Format:

<type>: <short summary>

<detailed description (optional)>

<issue reference (if applicable)>

Types:

  • feat: New feature.

  • fix: Bug fix.

  • docs: Documentation changes.

  • style: Code style changes (formatting, no logic change).

  • refactor: Code refactoring.

  • test: Adding or updating tests.

  • chore: Maintenance tasks.

Example:

feat: Add logistic growth fitting function

Implements sigmoid/logistic growth curve fitting with
three parameters: carrying capacity (L), growth rate (k),
and midpoint (t0).

Closes #42
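
The format above can be checked mechanically. The following is a minimal sketch, assuming the summary line must start with one of the listed types and that a blank line separates it from any body; `check_commit_message` is a hypothetical helper, not an existing project hook.

```python
import re

# Types listed in the commit message format above.
COMMIT_TYPES = ("feat", "fix", "docs", "style", "refactor", "test", "chore")

# "<type>: <short summary>" with a non-empty summary.
_SUMMARY_PATTERN = re.compile(r"^(?:" + "|".join(COMMIT_TYPES) + r"): \S.*$")


def check_commit_message(message: str) -> bool:
    """Return True if the message follows '<type>: <short summary>'.

    Assumption: when a body is present, the second line must be blank.
    """
    lines = message.splitlines()
    if not lines or _SUMMARY_PATTERN.match(lines[0]) is None:
        return False
    return len(lines) == 1 or lines[1].strip() == ""


if __name__ == "__main__":
    example = "feat: Add logistic growth fitting function\n\nCloses #42"
    print(check_commit_message(example))
    print(check_commit_message("Fixed stuff"))
```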

2. Push to Your Fork

git push origin feature/your-feature-name

3. Create Pull Request

  1. Go to your fork on GitHub

  2. Click “New Pull Request”

  3. Select your branch

  4. Fill out the pull request template:

    • Title: Clear, concise summary

    • Description: What changes were made and why

    • Testing: How you tested the changes

    • Screenshots: For UI changes

    • Closes: Reference any related issues

Pull Request Checklist

Before submitting, ensure:

  • Code follows project style guidelines

  • All tests pass (pytest tests/)

  • New tests added for new functionality

  • Documentation updated (if needed)

  • Commit messages are clear and descriptive

  • No unnecessary files included (check .gitignore)

  • Type hints included

  • Docstrings added for new functions

  • Translations added (if UI changes)

  • CHANGELOG.md updated (for significant changes)

Review Process

  1. Automated checks: CI/CD runs tests automatically

  2. Code review: Maintainer reviews your code

  3. Feedback: Address any requested changes

  4. Approval: Once approved, changes are merged

  5. Cleanup: Delete your branch after merge

Development Setup

VS Code Setup

Recommended extensions:

  • Python (Microsoft)

  • Pylance

  • Python Test Explorer

  • GitLens

  • Markdown All in One

Workspace settings (.vscode/settings.json):

{
    "ruff.enable": true,
    "python.formatting.provider": "black",
    "python.testing.pytestEnabled": true,
    "editor.rulers": [100],
    "files.trimTrailingWhitespace": true,
    "files.insertFinalNewline": true
}

The project uses Ruff for linting (see requirements-dev.txt and pyproject.toml optional [dev] dependencies). Ruff reads configuration from pyproject.toml. Install the Ruff extension for VS Code.

Project Structure

Understanding the codebase:

RegressionLab/
├── src/                    # Source code
│   ├── config/             # Configuration (env, theme, paths, constants, equations.yaml)
│   ├── i18n.py             # Internationalization
│   ├── main_program.py     # Tkinter entry point
│   ├── fitting/            # Curve fitting (functions/, fitting_utils, workflow_controller)
│   ├── frontend/           # Tkinter UI (ui_main_menu, image_utils, ui_dialogs/)
│   ├── loaders/            # Data loaders, CSV/Excel
│   ├── plotting/           # Plot utilities
│   ├── streamlit_app/      # Streamlit web app (app.py, sections/)
│   ├── locales/            # Translation JSON (en, es, de)
│   └── utils/              # Exceptions, logger, validators
├── tests/                  # Pytest suite (run_tests.py, conftest.py, test_*.py)
├── docs/                   # User documentation (Markdown)
├── sphinx-docs/            # Sphinx sources and build scripts
├── input/                  # Sample datasets
├── output/                 # Generated plots
├── bin/                    # Launchers (run, run_streamlit, run_tests)
├── scripts/                # Helper scripts (clean, generate_test_datasets, generate_multi_var_dataset)
├── install.bat             # Windows installation script
├── install.sh              # Linux/macOS installation script
├── setup.bat               # Windows setup script
├── setup.sh                # Linux/macOS setup script
├── .env.example            # Sample environment configuration (dotenv)
├── .gitignore              # Git ignore rules
├── requirements.txt        # Python dependencies (runtime + Streamlit, Pillow)
├── requirements-dev.txt    # Developer dependencies (pytest, ruff, black, mypy, pre-commit)
├── pyproject.toml          # Project metadata, build config, and optional [dev] / [docs] deps
├── README.md               # Project overview/readme
├── CHANGELOG.md            # Project changelog
└── LICENSE                 # License file

Communication

Asking Questions

  • GitHub Discussions: For general questions and ideas

  • GitHub Issues: For bug reports and feature requests

  • Email: For private inquiries

Reporting Bugs

Use the GitHub issue template and include:

  1. Title: Clear, specific description

  2. Version: RegressionLab version

  3. Environment: OS, Python version

  4. Steps to reproduce: Exact steps

  5. Expected behavior: What should happen

  6. Actual behavior: What actually happens

  7. Error messages: Full traceback

  8. Sample data: If possible

  9. Screenshots: For UI issues
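
The version and environment items can be gathered with a short script such as the sketch below. The distribution name "regressionlab" passed to `importlib.metadata` is an assumption; if the package is installed under a different name, or not installed at all, the helper reports that instead of failing.

```python
import platform
import sys
from importlib import metadata


def package_version(distribution: str = "regressionlab") -> str:
    """Return the installed version, or a note if the distribution is missing.

    Assumption: "regressionlab" is the installed distribution name.
    """
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return "not installed"


def environment_report() -> str:
    """Build a short, paste-ready summary of the local environment."""
    lines = [
        f"OS: {platform.system()} {platform.release()}",
        f"Python: {platform.python_version()} ({sys.executable})",
        f"RegressionLab: {package_version()}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(environment_report())
```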

Suggesting Features

Use the GitHub feature request template:

  1. Problem description: What problem does this solve?

  2. Proposed solution: How should it work?

  3. Alternatives: Other solutions considered

  4. Additional context: Examples, mockups, etc.

Code of Conduct

Our Standards

  • Be respectful: Treat everyone with respect.

  • Be constructive: Provide helpful feedback.

  • Be patient: Everyone is learning.

  • Be professional: Keep discussions on-topic.

Unacceptable Behavior

  • Harassment or discriminatory language.

  • Personal attacks.

  • Trolling or inflammatory comments.

  • Publishing private information.

  • Other unprofessional conduct.

Enforcement

Violations may result in:

  1. Warning.

  2. Temporary ban.

  3. Permanent ban.

Report issues to: alejandro.mata.ali@gmail.com

License

By contributing to RegressionLab, you agree that your contributions will be licensed under the MIT License.

Recognition

Contributors may be recognized in:

  • A CONTRIBUTORS.md file (if added to the project).

  • Release notes.

  • Documentation credits.

Thank you for contributing to RegressionLab! 🎉


Questions about contributing? Open a GitHub Discussion or email alejandro.mata.ali@gmail.com.