I can't really help with the filter design - the most straightforward way would probably be to make an external using the Pd code or other reference code in c or java, but I have no idea how this works or how tricky it is.
Two less than stellar options in Max: Try as you suggested, but with tapin~/tapout~ or send~/receive~ in a poly~ with vector size 1 to achieve the one sample delay (very inefficient). Or use pd~ inside Max, see http://crca.ucsd.edu/~msp/software.html (don't know which Max/Pd versions you'll need).
My guess is that PD's filter definition is the same as ISPW's 2p2z~. I found a 2p2z~ abstraction, part of the ISPW compatibility lib. It shows that values should not only be swapped, as correctly stated but that the b-coeffs should also be negated. Does that work?