We recently ran the Making Software Greener code analyzer across nine widely used Python libraries: boto3, django, fastapi, flask, jinja, numpy, pandas, pytorch, and requests. The goal was not to single out these projects, but to learn from real-world code that millions of developers rely on every day. By looking at what shows up most often, we can better understand which mistakes are easy to make—and where small improvements could lead to cleaner, faster, and more sustainable software.

What We Found

The analyzer flagged a mix of performance inefficiencies and general code issues. Here are the top results, ranked by how often they appeared:

Issue | Count
Using print(...) in library code (prefer logging) | 3256
String concatenation in a loop (use list + ''.join(...) or io.StringIO) | 2524
Broad except Exception (narrow to specific exceptions) | 1273
df.append(...) inside a loop (accumulate rows then use pd.concat) | 569
Membership test on a sequence in a loop (prefer set for O(1)) | 542
subprocess.run(...) without a timeout | 165
subprocess.check_output(...) without a timeout | 160
open(...) not used as a context manager (use with open(...) as f:) | 133
Unnecessary list materialization for max(...) (prefer a generator) | 127
Unnecessary list materialization for sum(...) (prefer a generator) | 104
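
The last two rows may be the least familiar, so here is a small sketch of what "unnecessary list materialization" means. The data and variable names are made up for illustration and are not taken from any of the analyzed libraries.

```python
# Hypothetical input data, used only for illustration.
values = range(1, 1001)

# Pattern the analyzer flags: the list comprehension builds a full
# temporary list in memory before sum()/max() ever sees it.
total_from_list = sum([v * v for v in values])
largest_from_list = max([v % 7 for v in values])

# Generator expression: items are produced one at a time and consumed
# immediately, so no intermediate list is allocated.
total_from_gen = sum(v * v for v in values)
largest_from_gen = max(v % 7 for v in values)
```

Both forms return the same results. The generator version simply avoids holding the whole intermediate list in memory, which matters more as the input grows.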

Why It Matters

These issues aren’t catastrophic bugs. Most of the time, the code still works as intended. But they do add up:

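  • Performance: Strings are immutable, and df.append(...) returns a new DataFrame on every call, so concatenating in a loop or appending row by row can copy the accumulated data on each iteration. The total work can then grow quadratically with the number of iterations. In large-scale systems, that wasted time can also mean wasted energy.
  • Reliability: Catching Exception broadly can swallow errors the code never meant to handle, which makes problems harder to trace and harder to handle safely.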
  • Maintainability: Using print(...) instead of logging makes it harder to control output, filter messages, or integrate with monitoring tools.
  • Resource use: Omitting timeouts in subprocess calls risks hanging processes, which can lock up resources longer than necessary.
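
To make the performance point concrete, here is a minimal sketch of the string-building and membership-test patterns next to their suggested alternatives. The function names and inputs are hypothetical, chosen only for this example.

```python
import io


def build_report_concat(lines):
    """Flagged pattern: += on a str inside a loop may copy the string each time."""
    out = ""
    for line in lines:
        out += line + "\n"
    return out


def build_report_join(lines):
    """Alternative 1: let ''.join(...) assemble the result in one pass."""
    return "".join(f"{line}\n" for line in lines)


def build_report_stringio(lines):
    """Alternative 2: write pieces into an in-memory text buffer."""
    buf = io.StringIO()
    for line in lines:
        buf.write(line)
        buf.write("\n")
    return buf.getvalue()


def count_known(items, known):
    """Count items present in `known`, using a set for the lookups."""
    # Build the set once, outside the loop: lookups are O(1) on average,
    # versus an O(n) scan for each `in` test against a list or tuple.
    known_set = set(known)
    return sum(1 for item in items if item in known_set)
```

All three report builders produce the same string. CPython can sometimes optimize += on strings in place, so the gap varies in practice, but the join and StringIO versions do not depend on that optimization.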

Learning from Patterns

The fact that these issues show up so often in mature, widely used libraries highlights how easy they are to introduce. It’s a reminder that even experienced teams can fall into patterns that work “well enough” in the moment but have hidden costs over time.

Tools like the Making Software Greener analyzer are designed to surface these patterns so teams can spot them early. The benefit isn’t only cleaner code—it’s also more efficient and sustainable software in the long run.

Moving Forward

If you’re working on Python projects of your own, here are some simple practices to keep in mind:

  • Use logging instead of print(...) for better control and flexibility.
  • Favor list joins or StringIO over concatenation inside loops.
  • Be specific with exception handling.
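  • Collect DataFrame rows in a list and call pd.concat once, instead of appending repeatedly. DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so this also keeps code working on current pandas versions.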
  • Add timeouts to subprocess calls to prevent runaway processes.
  • Use context managers (with open(...)) to ensure files are closed properly.
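
Several of these habits fit together in a few lines. The sketch below uses only the standard library. The helper names read_config and run_with_timeout, the default timeout, and the log messages are assumptions made for illustration, not code from any of the analyzed projects.

```python
import logging
import subprocess

# Library code gets a module-level logger instead of calling print(...).
# Attaching a NullHandler leaves output configuration to the application.
logger = logging.getLogger(__name__)
logger.addHandler(logging.NullHandler())


def read_config(path):
    """Return the file's text, or an empty string if the file is missing."""
    try:
        # The context manager closes the file even if read() raises.
        with open(path, encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        # Narrow except: only the case we know how to handle is caught.
        logger.warning("config file %s not found; using defaults", path)
        return ""


def run_with_timeout(args, seconds=5):
    """Run a command and return its stdout, or None if it times out."""
    try:
        result = subprocess.run(
            args,
            capture_output=True,
            text=True,
            timeout=seconds,  # kill the child instead of waiting forever
            check=True,       # non-zero exit raises CalledProcessError
        )
    except subprocess.TimeoutExpired:
        logger.error("command %r timed out after %s s", args, seconds)
        return None
    return result.stdout
```

In this sketch CalledProcessError is deliberately left uncaught, so a failing command still surfaces to the caller. Whether to catch it, and what timeout is reasonable, depends on the application.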

These small shifts in habit can make your code faster, more reliable, and easier to maintain—all while reducing wasted computation.
