OK, I haven't tried anything like this before, but here's an idea you might want to follow up on.
I guess it depends on whether you are printing from the internal network or from outside, which may be an issue. Ghostscript is generally used for this kind of thing, so it could be worth doing some research around that.
If you are thinking along the lines of the CUPS PDF printer, it would, of course, need to see a print job (which could come from an lp command line, i.e. generated from a Perl script), and that job would have to come from the internal network. The output can be sent to a folder, which your web app can then scan.
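To make the idea a bit more concrete, here is a rough, untested-in-production sketch in Python (a Perl script could do the same thing). It assumes a CUPS queue exists for the PDF printer and that the printer is configured to write its output into a folder your web app can read. The function names, the queue name and the folder path are all made up for illustration; only `lp -d <queue> <file>` is the standard CUPS command.

```python
import subprocess
from pathlib import Path


def build_lp_command(printer: str, file_path: str) -> list[str]:
    """Build the argument list for submitting a file to a CUPS queue via lp."""
    # "lp -d <destination> <file>" is the standard CUPS submission command.
    return ["lp", "-d", printer, file_path]


def submit_job(printer: str, file_path: str) -> None:
    """Send the file to the print queue; raises CalledProcessError if lp fails."""
    subprocess.run(build_lp_command(printer, file_path), check=True)


def find_new_pdfs(output_dir: str, seen: set[str]) -> list[Path]:
    """Return PDFs in output_dir that have not been seen before.

    Results are ordered oldest first, and their names are added to `seen`
    so that the next scan skips them.
    """
    new_files = [p for p in Path(output_dir).glob("*.pdf") if p.name not in seen]
    new_files.sort(key=lambda p: (p.stat().st_mtime, p.name))
    seen.update(p.name for p in new_files)
    return new_files
```

You would call something like `submit_job("PDF", "report.txt")` from the script on the internal network, and have the web app call `find_new_pdfs("/path/to/output", seen)` on a timer, where both the queue name "PDF" and the path are placeholders for whatever your setup uses. Note that a file may still be being written when the scan picks it up, so a real version would probably want to wait until its size stops changing.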