Why is it that, in the year of our Lord 2025, installing a Python package is still such a gamble?
This post comes from someone who rarely uses Python, but consider the following:
- The rare times I have to use it, I am often confronted with dependency hell (and if you think that is a skill issue, hold that thought and keep reading);
- I am one of the maintainers of the Nix R ecosystem, but I have also occasionally packaged a number of Python packages for Nix.
This last point is important, and I think it gives me a good perspective on the problem this blog post is about. When it comes to R packages, we know that we can easily mirror CRAN and Bioconductor, because a lot of curation effort has already gone into them: we know that the packages work together. The same cannot be done for Python: the curation is on us.
If you are using Python to analyze data (I am sorry for you), you have probably run into this problem: you install a package that requires `numpy < 2` and another that requires `numpy >= 2`. You are cooked, as the young people say. The resolver cannot help you because the requirements are literally incompatible. Nobody and nothing can help you. No amount of package managers written in Rust can help you. The problem is PyPI.
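To see why no resolver can help, here is a minimal sketch using the `packaging` library (the version numbers are only illustrative): the intersection of the two constraints contains no version at all.

```python
from packaging.specifiers import SpecifierSet
from packaging.version import Version

a = SpecifierSet("<2")   # what the first package requires of numpy
b = SpecifierSet(">=2")  # what the second package requires of numpy

combined = a & b         # a resolver has to satisfy both at once
candidates = ["1.26.4", "2.0.0", "2.3.1"]
print([v for v in candidates if Version(v) in combined])
# [] -- no version of numpy can ever satisfy both constraints
```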
CRAN does not tolerate this nonsense
This situation simply does not happen in R. Why? Because CRAN enforces a system where packages are not only tested in isolation, but also against their reverse dependencies. When {ggplot2} or {dplyr} changes in a way that breaks other packages, CRAN catches it. Package authors get a warning, and if they do not fix things (within 2 weeks!), their package gets archived, which means that when users try to install it with `install.packages("foo")`, it won’t work. Which means that if a package is on CRAN, `install.packages("foo")` will work. Not “works if you’re lucky.” Not “works if you pin the right versions.” It simply works (as long as the correct system-level dependencies are available if you have to compile it, which is not a problem if you install binaries). In fact, you can’t even publish a package that puts upper bounds on the versions of its dependencies. Your package must work, forever and always, with all the packages on CRAN. Honestly, quite impressive for something that is not even a real programming language, right? (That last bit is, by the way, sarcastic.)
And CRAN maintains this consistency across 27,000 packages. PyPI is much larger, granted, but I doubt that much more than 30k of its packages are actually used often. Probably a few thousand, maybe even just a few hundred, are (especially for data analysis).
PyPI is a warehouse, not an ecosystem
PyPI does not do this. It is a dumping ground for tarballs and wheels. No global checks, no compatibility guarantees, no ecosystem-wide consistency. If package A and package B declare mutually exclusive requirements, PyPI accepts and hosts them both.
Then we spend enormous effort building tools to try to tame this mess: Conda, Poetry, Hatch, uv, pipx, and Nix (well, Nix is not specifically made for Python, but it can also be used to set up virtual environments). They are all great tools, but they cannot solve the core problem: if the constraints themselves are impossible to satisfy, no resolver can save you. At best, these tools give you a way to freeze a working mess before it collapses. Just pray to whatever deity you like that adding a new package down the line does not blow up your environment.
This is not an ecosystem. It is chaos with good packaging tools.
But Nix helps a little more: at least with Nix you can patch a package’s `pyproject.toml` to try to relax its constraints, as I did for `saiph`:
```nix
postPatch = ''
  # Remove these constraints
  substituteInPlace pyproject.toml \
    --replace 'numpy = "^1"' 'numpy = ">=1"' \
    --replace 'msgspec = "^0.18.5"' 'msgspec = ">=0.18.5"'
'';
```

This step relaxes the constraints directly in the `pyproject.toml`, but that may not be a good idea: those constraints may have been put there for a good reason. However, the unit tests pass (more than 150 of them), so in this specific case I think I’m good. If PyPI were managed like CRAN, `saiph`’s authors would have had 2 weeks to check that `saiph` worked well with NumPy 2, which seems to be the case here. But patching packages is certainly not a solution for everything.
The scale
“But Python is too big and diverse for CRAN-style governance!” I hear you scream. This is just false. CRAN manages 27,000 packages in domains as varied as bioinformatics, finance, web scraping, geospatial analysis, and machine learning, and it does so without counting the old packages that have been archived over the years. The R ecosystem is not small or homogeneous. It is smaller than PyPI in absolute numbers, yes, but honestly, I doubt that there are more data analysis packages on PyPI than on CRAN, and if old, unmaintained Python packages were not counted, the PyPI number would also be much smaller. If someone has hard statistics on this, I would love to read them.
The difference is not technical capacity or ecosystem size. It is philosophy. CRAN chose consistency over permissiveness. PyPI chose the opposite.
And no, conda-forge is not enough
conda-forge is curated: builds are consistent, compilers are pinned, migrations are coordinated. That’s great, and it proves that Python packaging can work at scale.
But if package A wants `numpy < 2` and package B wants `numpy >= 2`, conda-forge will host them both, and you are still stuck. There is no enforcement mechanism that forces the ecosystem to resolve contradictions. CRAN has one. conda-forge does not.
conda-forge is a step in the right direction, but tighter governance is needed.
What Python actually needs: PyPAN
Python needs a curated layer on top of PyPI that enforces consistency. Call it PyPAN: the Python Package Archive Network.
This is what PyPAN would do (a toy sketch of the reverse dependency check follows the list):
- Mirror packages from PyPI, but only those that pass ecosystem-wide checks
- Test every package against its reverse dependencies, not just on its own
- Coordinate migrations for major breaking changes (e.g. numpy 2.0)
- Archive packages that refuse to adapt
- Publish consistent, installable snapshots of the entire ecosystem
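To make the reverse dependency idea concrete, here is a toy sketch in Python. The hard-coded graph and the function name are hypothetical; a real PyPAN would build the graph from package metadata. But inverting the graph and re-testing everything that depends on a changed package, transitively, is exactly what CRAN does today.

```python
from collections import defaultdict

# Hypothetical toy index: package -> direct dependencies
deps = {
    "pandas": {"numpy"},
    "scipy": {"numpy"},
    "scikit-learn": {"numpy", "scipy"},
    "matplotlib": {"numpy"},
}

# Invert the graph: package -> packages that depend on it
rdeps = defaultdict(set)
for pkg, ds in deps.items():
    for d in ds:
        rdeps[d].add(pkg)

def revdep_check_set(changed: str) -> set[str]:
    """All packages whose test suites must pass before `changed` is accepted."""
    to_test, stack = set(), [changed]
    while stack:
        for r in rdeps[stack.pop()]:
            if r not in to_test:
                to_test.add(r)
                stack.append(r)
    return to_test

print(revdep_check_set("numpy"))
# all four downstream packages must be re-tested (order may vary)
```

If any of those downstream test suites fail and the failure is not fixed within the grace period, a package gets archived: that is the whole enforcement mechanism.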
In other words: CRAN, but for Python.
If CRAN can maintain consistency across 27,000 packages (with such a small team, by the way), Python can too. The question is not whether it is technically possible, but whether the Python community is willing to prioritize ecosystem stability over individual package-author autonomy.
Why developers would submit to PyPAN
Why would a package author care? Simple:
- Visibility: users prefer PyPAN packages because they actually install and work
- Less support burden: fewer bug reports about broken installations or dependency hell
- Shared responsibility: migration effort is spread across the ecosystem, not left to individual maintainers
- Credibility: “on PyPAN” becomes a mark of quality and stability, especially for scientific and industrial projects
If you don’t sign up, fine. But in the end, users will prefer packages that are part of the curated, consistent set, just as R users prefer CRAN packages and avoid installing from GitHub when possible.
Maybe let’s start small
CRAN’s model proves that ecosystem-wide consistency is feasible, and I believe it could also be feasible at Python’s scale. conda-forge proves that curated Python packaging works.
Until Python has something like PyPAN, nothing changes. Dependency hell will keep developers up at night.
But we could start small. PyPAN could focus first on data science, analysis, and statistics packages: the core scientific Python ecosystem. This subset is:
- More manageable: ~500-1000 packages (I made this range up; could be more, could be less; the point is that it is not the 300,000 packages on PyPI) instead of all of PyPI
- Highly interconnected: numpy, pandas, scikit-learn, matplotlib, and scipy form a natural dependency graph
- Stability-oriented: data scientists prioritize reproducible results over bleeding-edge features
- Community precedent: scientific Python already coordinates large migrations (Python 2 → 3, numpy 2.0)
- Proven demand: these users have already flocked to conda-forge for stability
A PyPAN-DS (Data Science) could demonstrate that the model works, build trust, and create momentum for wider adoption. Once people see that `pip install pandas` (or `uv`, if you prefer) can work as reliably as `install.packages('dplyr')`, expanding to web frameworks and other domains becomes a much easier sell.
The scientific Python community has the cohesion, the need, and the precedent for this kind of coordination. It could be Python’s CRAN pilot program.
Soooo … who builds this?
#Python #cran #RBloggers

