# Engineering Economy Problem Generator

This project is a Python-based system designed to procedurally generate engineering economy problems. It focuses on topics such as simple and compound interest, effective interest rates, continuous compounding, Banker's Discount, and date-specific interest calculations (exact and ordinary). The system aims to create varied and realistic word problems with step-by-step solutions, suitable for practice and learning.

## Covered Concepts

The generator currently supports problems related to the following topics (aligned with typical Engineering Economy curricula):

*   **Simple Interest:**
    *   Calculating Future Value (F)
    *   Calculating Present Value (P) from Future Value
    *   Calculating Simple Interest Amount (I)
    *   Calculating Present Value (P) from Interest Amount
    *   Calculating Simple Interest Rate (i)
    *   Calculating Time Period (n)
*   **Types of Simple Interest:**
    *   Exact Simple Interest (using actual days in month/year, including leap year considerations)
    *   Ordinary Simple Interest (using 30-day months, 360-day year)
*   **Banker's Discount:**
    *   Calculating Proceeds
    *   Calculating Discount Rate
    *   Calculating Equivalent Simple Interest Rate
*   **Compound Interest:**
    *   Calculating Future Value (F)
    *   Calculating Present Value (P)
    *   Calculating Nominal Interest Rate (r)
    *   Calculating Time Period (t or n)
*   **Effective Rate of Interest:**
    *   Calculating Effective Rate (ER) given nominal rate and compounding frequency.
*   **Continuous Compounding Interest:**
    *   Calculating Future Value (F)
    *   Calculating Present Value (P)
    *   Calculating Nominal Interest Rate (r)
    *   Calculating Time Period (t)
    *   Calculating Equivalent Simple Interest Rate for a continuously compounded rate.
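
As a quick reference, the textbook relationships behind several of the topics above can be written as small Python functions. This is only an illustrative sketch: the function names are hypothetical and are not taken from the project, whose actual formulas live as strings in `building_blocks/financial_concepts.json`. Rates are decimals (0.12 for 12%) and time is in years.

```python
import math

# Hypothetical helper names, for illustration only.

def simple_future_value(P, i, n):
    """Simple interest: F = P(1 + i*n)."""
    return P * (1 + i * n)

def bankers_discount_proceeds(F, d, n):
    """Banker's discount: proceeds = F(1 - d*n)."""
    return F * (1 - d * n)

def equivalent_simple_rate_from_discount(d, n):
    """Simple interest rate equivalent to discount rate d: i = d / (1 - d*n)."""
    return d / (1 - d * n)

def compound_future_value(P, r, m, t):
    """Compound interest: F = P(1 + r/m)^(m*t), with m periods per year."""
    return P * (1 + r / m) ** (m * t)

def effective_rate(r, m):
    """Effective annual rate: ER = (1 + r/m)^m - 1."""
    return (1 + r / m) ** m - 1

def continuous_future_value(P, r, t):
    """Continuous compounding: F = P * e^(r*t)."""
    return P * math.exp(r * t)

def equivalent_simple_rate_from_continuous(r, t):
    """Simple rate giving the same F over t years: i = (e^(r*t) - 1) / t."""
    return (math.exp(r * t) - 1) / t

print(f"ER for 12% compounded monthly: {effective_rate(0.12, 12):.4%}")
```

The remaining targets in each topic (present value, rate, time) follow by rearranging these same relationships.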

## System Architecture

The problem generator is a modular system composed of several Python scripts and JSON data files that work together to produce financial problems.

```mermaid
graph LR
    A[problem_engine.py] --> B(data_loader.py);
    A --> C(value_sampler.py);
    A --> D(date_utils.py);
    A --> E(formula_evaluator.py);
    A --> F(narrative_builder.py);
    A --> G(solution_presenter.py);

    B -- "Reads & Caches" --> H["building_blocks/financial_concepts.json"];
    B -- "Reads & Caches" --> I["data/value_ranges.json"];
    B -- "Reads & Caches" --> J["data/text_snippets.json"];
    B -- "Reads & Caches" --> K["data/names.json"];

    C -- "Uses Constraints From" --> I;
    C -- "Uses Date Functions From" --> D;
    F -- "Uses Actor Names From" --> K;
    F -- "Uses Text Templates From" --> J;
    G -- "Uses Text Templates From" --> J;

    subgraph "Python Modules"
        A
        B
        C
        D
        E
        F
        G
    end

    subgraph "Data Files (JSON)"
        H["building_blocks/financial_concepts.json"]
        I["data/value_ranges.json"]
        J["data/text_snippets.json"]
        K["data/names.json"]
    end
```

### Python Modules:

*   **`main.py`:**
    *   The main entry point for running the problem generator.
    *   Handles initialization and calls the problem generation engine to produce and display problems.
*   **`problem_engine.py`:**
    *   The central coordinator of the problem generation process.
    *   Selects a financial concept, manages the flow of data between other modules (sampling values, evaluating formulas, building narrative, presenting solution), and assembles the final problem output.
    *   Includes logic for handling special variable types (like dates) and basic plausibility checks (e.g., for interest rate calculations).
    *   Caches loaded data to improve performance.
*   **`data_loader.py`:**
    *   Responsible for loading data from the various JSON configuration files (`financial_concepts.json`, `value_ranges.json`, `text_snippets.json`, `names.json`).
    *   Provides functions to access cached versions of this data.
*   **`value_sampler.py`:**
    *   Generates random numerical values for the variables in a problem (e.g., principal, interest rate, time).
    *   Uses constraints (min, max, precision, type) defined in `data/value_ranges.json`.
    *   Includes functions to format these values for display (e.g., adding currency symbols, percentage signs, thousands separators, formatting dates).
*   **`date_utils.py`:**
    *   Provides utility functions for date-related calculations, crucial for problems involving specific time periods (e.g., exact simple interest).
    *   Includes functions for leap year checks, days in month/year, generating random dates and date periods, and calculating exact days between dates.
*   **`formula_evaluator.py`:**
    *   Evaluates mathematical formula strings (defined in `building_blocks/financial_concepts.json`) with Python `eval()` in a restricted environment that limits what a formula string can access.
    *   Takes a formula string and a context dictionary of variables and their values, then returns the calculated result.
*   **`narrative_builder.py`:**
    *   Constructs the natural language word problem.
    *   Uses templates from `data/text_snippets.json` and actor names/details from `data/names.json`.
    *   Combines these with the sampled numerical values (formatted by `value_sampler.py`) to create a coherent and varied problem statement.
*   **`solution_presenter.py`:**
    *   Generates a step-by-step guided solution for the problem.
    *   Uses `solution_step_keys` defined in `building_blocks/financial_concepts.json` to fetch corresponding solution step templates from `data/text_snippets.json`.
    *   Populates these templates with the specific values and intermediate calculations for the current problem, including a step showing the formula with values substituted.
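
To make the restricted `eval()` idea from `formula_evaluator.py` concrete, here is a minimal sketch. It assumes the evaluator blanks out `__builtins__` and exposes a small whitelist of math functions; the function name `evaluate_formula` and the `ALLOWED_NAMES` table are hypothetical, and the real module may be organized differently.

```python
import math

# Hypothetical whitelist; the real formula_evaluator.py may expose other names.
ALLOWED_NAMES = {"exp": math.exp, "log": math.log, "sqrt": math.sqrt, "pow": pow}

def evaluate_formula(formula, context):
    """Evaluate a formula string against a dict of variable values."""
    # An empty __builtins__ removes direct access to open(), __import__(), etc.
    env = {"__builtins__": {}}
    env.update(ALLOWED_NAMES)
    # Variables are passed as locals so they cannot overwrite the whitelist.
    return eval(formula, env, dict(context))

result = evaluate_formula("P * (1 + i * n)", {"P": 1000.0, "i": 0.08, "n": 3})
print(round(result, 2))
```

Blanking `__builtins__` narrows what a formula can reach, but it is not a complete sandbox. That is acceptable here mainly because the formula strings come from the project's own JSON files rather than from untrusted input.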

### Data Files (JSON):

*   **`building_blocks/financial_concepts.json`:**
    *   The core data file defining each type of financial problem the system can generate.
    *   Each "concept" includes:
        *   `concept_id`: A unique identifier.
        *   `description`: Human-readable explanation.
        *   `financial_topic`: The broader category (e.g., "Simple Interest").
        *   `target_unknown`: The variable the problem will ask to solve for.
        *   `variables_involved`: All variables relevant to the concept.
        *   `formulas`: A dictionary of formula strings (including for intermediate variables).
        *   `required_knowns_for_target`: Input variables needed to solve the problem.
        *   `narrative_hooks`: Keywords to guide narrative construction.
        *   `solution_step_keys`: An ordered list of keys mapping to solution step templates in `data/text_snippets.json`.
*   **`data/value_ranges.json`:**
    *   Defines the constraints for generating random numerical values (min, max, currency, units, precision, type like integer/float).
    *   Includes options for compounding frequencies and date generation parameters (like `date_period_generation`).
*   **`data/text_snippets.json`:**
    *   A rich repository of text fragments used by `narrative_builder.py` and `solution_presenter.py`.
    *   Contains templates for:
        *   Actor descriptions (persons, companies).
        *   Actions (loan, investment, repayment verbs).
        *   Time phrases, rate phrases, compounding phrases.
        *   Question starters and scenario framing elements.
        *   Detailed templates for each step of the guided solution (e.g., identifying knowns, stating formula, substituting values).
        *   Human-readable descriptions for variable keys.
*   **`data/names.json`:**
    *   Provides lists of names (titles, first names, last names for persons) and company name components (prefixes, suffixes, industry types).
    *   Also includes lists of items for loan/investment purposes to add flavor to narratives.
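
For orientation, a concept entry with the fields listed above might look like the following, shown as a Python dictionary that serializes to the same JSON shape. Every value here is invented for illustration (the `concept_id`, the hook keywords, and the solution step keys are all hypothetical); only the field names come from the description above.

```python
import json

# Field names as described for building_blocks/financial_concepts.json.
REQUIRED_FIELDS = [
    "concept_id", "description", "financial_topic", "target_unknown",
    "variables_involved", "formulas", "required_knowns_for_target",
    "narrative_hooks", "solution_step_keys",
]

# Hypothetical example entry; all values are made up for illustration.
example_concept = {
    "concept_id": "EXAMPLE_SIMPLE_INTEREST_SOLVE_FOR_F",
    "description": "Find the future value of a simple interest loan.",
    "financial_topic": "Simple Interest",
    "target_unknown": "F",
    "variables_involved": ["P", "i", "n", "I", "F"],
    # The intermediate variable I is listed before F, which depends on it.
    "formulas": {"I": "P * i * n", "F": "P + I"},
    "required_knowns_for_target": ["P", "i", "n"],
    "narrative_hooks": ["loan", "repayment"],
    "solution_step_keys": ["identify_knowns", "state_formula", "substitute_values"],
}

def missing_fields(concept):
    """Return the required field names that are absent from a concept entry."""
    return [name for name in REQUIRED_FIELDS if name not in concept]

print(json.dumps(example_concept, indent=2))
print("Missing fields:", missing_fields(example_concept))
```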

## Problem Generation Flow

1.  **Initialization:** `problem_engine.py` (called by `main.py`) loads and caches all necessary data from JSON files via `data_loader.py`.
2.  **Concept Selection:** A financial concept is randomly chosen from `building_blocks/financial_concepts.json`.
3.  **Value Sampling:**
    *   For each `required_knowns_for_target` in the selected concept, `problem_engine.py` instructs `value_sampler.py` to generate a value.
    *   `value_sampler.py` uses `data/value_ranges.json` for constraints.
    *   For date-related concepts (Exact/Ordinary Simple Interest), `problem_engine.py` uses `date_utils.py` to generate start/end dates and calculate the number of days, time base, etc.
    *   Compounding frequencies are sampled from the options listed in `data/value_ranges.json`.
    *   Generated values are stored along with their metadata (units, precision).
    *   Basic plausibility checks (e.g., ensuring F > P when solving for rate) are performed, with resampling attempts if needed.
4.  **Formula Evaluation (Intermediate & Final):**
    *   `problem_engine.py` identifies all formulas (intermediate and target) for the concept.
    *   `formula_evaluator.py` evaluates these formulas sequentially, using the sampled known values and any previously calculated intermediate values.
5.  **Narrative Construction:**
    *   `narrative_builder.py` takes the selected concept, the (formatted) known values, and the target unknown variable.
    *   It uses templates from `data/text_snippets.json` and names from `data/names.json` to construct a word problem.
6.  **Solution Generation:**
    *   `solution_presenter.py` uses the `solution_step_keys` from the concept and corresponding templates from `data/text_snippets.json`.
    *   It populates these templates with the actual problem data (knowns, intermediates, calculated answer) to create a step-by-step solution, including showing the formula with values substituted.
7.  **Output:** `problem_engine.py` returns a structured dictionary containing the problem statement, the question, all variable data, the calculated answer (raw and formatted), and the list of solution steps. `main.py` then prints this information.
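
The flow above can be condensed into a single self-contained sketch. It is a simplified stand-in, not the project's code: the inline `CONCEPT`, `VALUE_RANGES`, and `NARRATIVE` objects replace the JSON files, the actor name is hard-coded, and the function names and output keys are hypothetical.

```python
import random

# Hypothetical stand-ins for the JSON data files; real keys and ranges may differ.
CONCEPT = {
    "concept_id": "EXAMPLE_SIMPLE_INTEREST_SOLVE_FOR_F",
    "target_unknown": "F",
    "formulas": {"I": "P * i * n", "F": "P + I"},
    "required_knowns_for_target": ["P", "i", "n"],
}
VALUE_RANGES = {
    "P": {"min": 1000, "max": 50000, "precision": 2},
    "i": {"min": 0.02, "max": 0.15, "precision": 4},
    "n": {"min": 1, "max": 10, "precision": 0},
}
NARRATIVE = ("{name} borrows {P:,.2f} at {i:.2%} simple interest per year "
             "for {n:.0f} years. How much is due at maturity?")

def sample_value(spec, rng):
    """Step 3: draw a value within [min, max], rounded to the given precision."""
    return round(rng.uniform(spec["min"], spec["max"]), spec["precision"])

def generate_problem(seed=None):
    rng = random.Random(seed)
    concept = CONCEPT  # Step 2 would pick randomly among many concepts.
    knowns = {var: sample_value(VALUE_RANGES[var], rng)
              for var in concept["required_knowns_for_target"]}
    # Step 4: evaluate formulas in order, so later ones can use earlier results.
    values = dict(knowns)
    for var, formula in concept["formulas"].items():
        values[var] = eval(formula, {"__builtins__": {}}, values)
    answer = values[concept["target_unknown"]]
    # Steps 5 and 6: fill a narrative template and build the solution steps.
    statement = NARRATIVE.format(name="Ms. Reyes", **knowns)
    steps = [
        "Knowns: " + ", ".join(f"{k} = {v}" for k, v in knowns.items()),
        "Formula: I = P * i * n, then F = P + I",
        f"Answer: F = {answer:,.2f}",
    ]
    # Step 7: return a structured dictionary.
    return {"statement": statement, "target": concept["target_unknown"],
            "knowns": knowns, "answer": answer, "solution_steps": steps}

problem = generate_problem(seed=42)
print(problem["statement"])
for step in problem["solution_steps"]:
    print(" -", step)
```

Passing a seed makes a run reproducible, which is convenient when checking a generated problem by hand.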

## Key Features

*   **Data-Driven Design:** Problem types, numerical ranges, and language are defined in external JSON files, enhancing flexibility.
*   **Modularity:** Components for value sampling, formula evaluation, narrative building, and solution presentation are distinct.
*   **Procedural Generation:** Ensures variety in generated problems.
*   **Natural Language Output:** Aims for human-readable problem statements and solutions.
*   **Step-by-Step Solutions:** Provides detailed, guided solutions, including formula substitution.
*   **Precision Control:** Manages internal precision for calculations and display precision for output.
*   **Date Handling:** Utilities for leap-year checks, days per month and year, and exact day counts between dates, supporting Exact and Ordinary Simple Interest.
*   **Plausibility Checks:** Basic checks to avoid some unrealistic problem scenarios (e.g., large negative rates).
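
To illustrate the date handling, the sketch below contrasts the two day-count conventions using only the standard library. The function names are hypothetical and do not mirror `date_utils.py`. Two assumptions are made: the ordinary count uses a plain 30-day-month difference with no end-of-month adjustment, and the exact time base is taken from the year of the start date.

```python
import calendar
from datetime import date

def exact_days(start, end):
    """Actual number of calendar days between two dates."""
    return (end - start).days

def ordinary_days(start, end):
    """Day count treating every month as 30 days (simple variant, an assumption)."""
    return ((end.year - start.year) * 360
            + (end.month - start.month) * 30
            + (end.day - start.day))

def days_in_year(year):
    """366 for leap years, otherwise 365."""
    return 366 if calendar.isleap(year) else 365

def simple_interest(P, i, days, time_base):
    """I = P * i * (days / time_base)."""
    return P * i * days / time_base

start, end = date(2024, 1, 15), date(2024, 6, 15)
exact = exact_days(start, end)        # 152 days (2024 is a leap year)
ordinary = ordinary_days(start, end)  # 150 days (5 months x 30)

# Assumption: the exact time base uses the year of the start date.
exact_interest = simple_interest(10000, 0.12, exact, days_in_year(start.year))
ordinary_interest = simple_interest(10000, 0.12, ordinary, 360)
print(f"Exact: {exact} days, I = {exact_interest:.2f}")
print(f"Ordinary: {ordinary} days, I = {ordinary_interest:.2f}")
```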

## Setup and Usage

This project uses `uv` for Python environment and dependency management, as indicated by `uv.lock` and `.python-version`.

1.  **Environment Setup (Recommended):**
    *   Ensure `uv` is installed.
    *   Navigate to the project root directory.
    *   Run `uv sync` if `pyproject.toml` lists the dependencies, or `uv pip install -r requirements.txt` if a `requirements.txt` file exists. Check `pyproject.toml` for the actual dependencies.
2.  **Running the Generator:**
    *   The primary way to generate a problem is by running `main.py` from the project root directory:
        ```bash
        python3 main.py
        ```
    *   The `src/problem_engine.py` script can also be run directly (e.g., `python3 src/problem_engine.py`). Its `if __name__ == '__main__':` block is currently configured to test all available concepts.

## Extensibility

The system is designed to be extensible for adding new problem types that fit the current architectural pattern (solving for a single target unknown via a primary formula, possibly with intermediate calculations).

*   **Adding New Financial Concepts (Similar Architecture):**
    1.  **Define Concept:** Add a new JSON object to `building_blocks/financial_concepts.json`. Specify:
        *   `concept_id` (unique string)
        *   `description` (string)
        *   `financial_topic` (string, e.g., "Simple Annuity")
        *   `target_unknown` (string, e.g., "F_annuity")
        *   `variables_involved` (list of strings, all variables in the formulas)
        *   `formulas` (dictionary: `{"variable_to_calc": "formula_string", ...}`). Ensure formulas for intermediate variables are listed before those that depend on them.
        *   `required_knowns_for_target` (list of strings, input variables to be sampled)
        *   `narrative_hooks` (list of strings, keywords for `narrative_builder.py`)
        *   `solution_step_keys` (list of strings, keys from `text_snippets.json -> solution_guidance`)
    2.  **Value Ranges:** If new variables require specific generation rules, add entries to `data/value_ranges.json`.
    3.  **Text Snippets:**
        *   Add descriptions for any new variables to `data/text_snippets.json` under `variable_descriptions`.
        *   Add new solution step templates to `solution_guidance` if existing ones are insufficient, and use these new keys in your concept's `solution_step_keys`.
        *   Add new narrative phrase templates if needed for unique storytelling.
    4.  **Variable Mapping:** Update `CONCEPT_VAR_TO_VALUERANGE_KEY` in `src/problem_engine.py` if new variables need mapping to `value_ranges.json` keys for sampling.
    5.  **Special Handling (If Any):** If the new concept requires unique logic in `problem_engine.py` (e.g., special date handling, unique plausibility checks) or `solution_presenter.py` (unique formatting for a solution step), those modules would need to be extended.
*   **Adding New Names/Items:** Edit `data/names.json` to add more variety to actors or scenario items.
*   **Adding New Text Snippets:** Edit `data/text_snippets.json` to add more phrasing options for narratives or solution steps.
*   **Modifying Value Ranges:** Adjust `data/value_ranges.json` to change the scope of numerical values generated.
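
Because formulas for intermediate variables must be listed before the formulas that depend on them, a small check can catch ordering mistakes when adding a concept. The helper below is a hypothetical sketch and not part of the project. It parses each formula with the standard `ast` module and reports any name that is neither a sampled known, an earlier formula result, nor an explicitly allowed function.

```python
import ast

def formula_names(formula):
    """Return every identifier used in a formula expression."""
    tree = ast.parse(formula, mode="eval")
    return {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}

def check_formula_order(formulas, knowns, allowed=()):
    """List (variable, missing_names) pairs for formulas evaluated too early.

    formulas: dict of variable -> formula string, in evaluation order.
    knowns:   names of the sampled input variables.
    allowed:  function names the evaluator provides (for example "exp").
    """
    available = set(knowns) | set(allowed)
    problems = []
    for var, formula in formulas.items():
        missing = formula_names(formula) - available
        if missing:
            problems.append((var, sorted(missing)))
        available.add(var)  # later formulas may use this result
    return problems

good = {"I": "P * i * n", "F": "P + I"}
bad = {"F": "P + I", "I": "P * i * n"}
print(check_formula_order(good, ["P", "i", "n"]))  # []
print(check_formula_order(bad, ["P", "i", "n"]))   # [('F', ['I'])]
```

This relies on dictionaries preserving insertion order, which `json.load` also does in Python 3.7 and later.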

### Future Enhancements / Scope for Advanced Concepts

The current architecture is well-suited for problems that can be solved by evaluating one or more direct formulas to find a target unknown. More complex engineering economy topics fall outside that pattern, for example:

*   **Equation of Value / Discrete Payments (with algebraic unknowns):** Problems like the "Acosta Holdings Loan" or "Mr. Cruz's Car Purchase" from the reference notes, where an unknown payment (X) appears in multiple terms of an equation of value and needs to be solved for algebraically.
*   **Arithmetic/Geometric Gradients:** While the "Restaurant Lease - Arithmetic Gradient" problem involves standard formulas (P/A and P/G), integrating these as a distinct, generatable concept requires defining them with their specific variables (A1, G, N) and solution steps.
*   **Annuities (Ordinary, Due, Deferred, Perpetuities):** These have their own sets of formulas and common problem structures.
*   **More Advanced Topics:** Depreciation, Bonds, Capital Budgeting (NPV, IRR), etc.

Implementing these advanced topics would require significant enhancements:

1.  **Equation Solving:** The current `formula_evaluator.py` evaluates expressions but doesn't solve equations algebraically. This would likely require integrating a symbolic math library (like SymPy) or developing custom logic to set up and solve equations of value.
2.  **Complex Concept Definitions:** `building_blocks/financial_concepts.json` would need a more sophisticated structure to define multiple cash flows, their timings, relationships (e.g., payment2 = 1.5 * payment1), and the setup of an equation rather than just direct formulas for a target.
3.  **Enhanced `problem_engine.py`:** Logic to manage multiple cash flows, select focal dates, and coordinate the setup of equations for the solver.
4.  **Advanced `solution_presenter.py`:** New solution step templates and logic to explain:
    *   Drawing cash flow diagrams (descriptively).
    *   Choosing a focal date.
    *   Bringing each cash flow to the focal date.
    *   Setting up the equation of value.
    *   Showing algebraic manipulation to solve for the unknown.
    *   Specific steps for gradient series components.
5.  **Richer `value_sampler.py`:** To generate series of payments, gradient parameters, or multiple related loan/payment amounts.

These represent a next level of complexity beyond the current system's direct formula evaluation approach. A `docs/` folder could be created in the future to detail potential architectures for these advanced features.
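
As one possible shape for the "custom logic" route mentioned under Equation Solving, the sketch below solves an equation of value without a symbolic library. It is not SymPy-based and only works when the unknown payment X enters each cash flow linearly, which covers cases such as "the second payment is twice the first". The function name, the cash-flow tuple layout, and the example figures are all hypothetical.

```python
def solve_equation_of_value(cash_flows, rate, focal_time=0):
    """Solve for X so that all cash flows sum to zero at the focal date.

    cash_flows: list of (time, constant, x_multiplier); each flow is worth
                constant + x_multiplier * X at its own time. Use positive
                values for money received and negative for money paid.
    rate:       compound interest rate per period.
    """
    constant_sum = 0.0
    coefficient_sum = 0.0
    for time, constant, x_multiplier in cash_flows:
        # Move the flow to the focal date with the compound interest factor.
        factor = (1 + rate) ** (focal_time - time)
        constant_sum += constant * factor
        coefficient_sum += x_multiplier * factor
    if coefficient_sum == 0:
        raise ValueError("The unknown does not appear in the equation of value.")
    # constant_sum + coefficient_sum * X = 0
    return -constant_sum / coefficient_sum

# Hypothetical example: borrow 10,000 now; repay X after 1 year and 2X after 2 years.
flows = [(0, 10000, 0), (1, 0, -1), (2, 0, -2)]
X = solve_equation_of_value(flows, rate=0.10)
print(f"X = {X:,.2f}")
```

Under compound interest the result does not depend on the focal date chosen, so calling the function with `focal_time=2` should give the same X. Nonlinear unknowns (a rate or a number of periods) would still need a numeric or symbolic solver.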

## Known Issues / Potential Refinements

*   **Banker's Discount - Negative Proceeds:** For the `BANKERS_DISCOUNT_SOLVE_FOR_PROCEEDS` concept, it's possible to generate scenarios where the calculated discount amount is larger than the maturity value (that is, when the discount rate multiplied by the time exceeds 1), resulting in negative proceeds. While mathematically possible, this is often an undesirable outcome for typical practice problems. A future refinement could involve adding more specific plausibility checks in `problem_engine.py` to resample input values (like discount rate or time) to ensure positive proceeds.
*   **Plausibility of Complex Scenarios:** As more complex multi-step problems are considered (like Equation of Value), ensuring the generated numbers lead to "sensible" real-world scenarios will require more sophisticated plausibility checks during value sampling.
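
A resampling check for the negative-proceeds case could look roughly like this. It is a sketch of the proposed refinement, not existing project code: the function name, the sampling ranges, and the attempt limit are assumptions chosen for illustration.

```python
import random

def sample_discount_inputs(rng, max_attempts=20):
    """Sample F, d, n for a banker's discount problem with positive proceeds."""
    for _ in range(max_attempts):
        # Hypothetical ranges; the real ones would come from value_ranges.json.
        F = round(rng.uniform(1000, 50000), 2)  # maturity value
        d = round(rng.uniform(0.05, 0.40), 4)   # discount rate per year
        n = round(rng.uniform(0.25, 5.0), 2)    # time in years
        proceeds = F * (1 - d * n)
        # Proceeds are positive only when d * n < 1; otherwise resample.
        if proceeds > 0:
            return {"F": F, "d": d, "n": n, "proceeds": proceeds}
    raise RuntimeError("No plausible inputs found; consider narrowing the ranges.")

sample = sample_discount_inputs(random.Random(3))
print(sample)
```

Capping the number of attempts keeps a badly configured range from looping forever, and the error points back to the ranges as the likely cause.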