Programmers wishing to write NDIS protocol drivers for Windows NT have long used the Packet sample in the NT DDK as a learning tool. Some programmers have used the Packet sample as an armature to build their drivers around, adding code but leaving the structure intact. Unfortunately, the sample has bugs. This page attempts to collect warnings about the dangers lurking in the Packet sample. This information in this page is offered in the spirit of cooperation among programmers. The author hopes that he will be contacted by those who find additional bugs in the Packet sample or wish to correct inaccuracies in this page.
Eliyas Yakub of Microsoft has reworked the NT 4.0 Packet sample and he believes he has addressed all the bugs reported on this page. You can download the updated sample from MS Support.
Eliyas reports on his work:
I think I took care of all the bugs. I took out the cleanup routine because it was completely wrong. In the first place, it's not a good idea to reset an adapter just because an application doesn't want to use the adapter any more. Resets are expensive operation. An adapter is not a resource of one application or protocol in the system. Secondly, there was a race condition between the protocol and miniport during freeing of packet. There is no reliable way to do this in the existing architecture. There are plans to introduce new APIs to cancel a pending Recv.I also don't like the idea of using FsContext to save the BindApapterContext. This is not generally recommended. I modified it in the W2K version but I left it as is on NT4.0.
I would like to hear about anyone's experience with the new version of the sample.
We won't list as bugs the design choices in the Packet sample that could have been done better and probably would have been done better in a production driver unless those choices harm the robustness or throughput of the driver. Under this heading come:
IoCompleteRequest()
.
This happens in PacketOpen()
, PacketRead()
,
PacketWrite()
and PacketIoControl()
.
PacketOpen()
does with its stack variable
Medium when calling NdisOpenAdapter()
.
PacketCleanup()
two calls are made to
IoMarkIrpPending()
against the Cleanup IRP
FlushIrp.
PacketCleanup()
an attempt is made to go from a receive packet's list entry in the
packet's ProtocolReserved
to the packet itself with the
use of one invocation of CONTAINING_RECORD
. This wrongly
telescopes the 2 enclosing levels into one, which happens to work just
because the ListElement
entry in
PACKET_RESERVED
comes first and hence has the same
address as PacketReserved
. If the order of structure
members in PACKET_RESERVED
were changed, the driver would
blow up if there were any pending reads at cleanup time. The correct
way to achieve this is to invoke CONTAINING_RECORD
twice:
the first time to get a pointer to the reserved area of the packet and
the second time to get a pointer to the packet itself. The correct
way can be seen in PacketReceiveIndicate()
, by the way. (Thanks
to Alf Ludwig for clarifications on this item.)
PacketIoControl()
there's a typo in which
IOCTL_PROTOCOL_SET_OID
is tested for twice, the test
results being ORed together; one of the tests should have been for
IOCTL_PROTOCOL_QUERY_OID
. Because of this the driver
will never ask the netcard driver for the value of an OID. (Thanks to
Victor Ishikeev for noticing this bug.)
ProtocolReceive
: When
PacketReceiveIndicate
has no use for a packet it returns
NDIS_STATUS_SUCCESS
when it should return
NDIS_STATUS_NOT_ACCEPTED
. According to the chapter on
NDIS Driver Lower-Edge Functions in the DDK's
NetworkReference, this will slow down overall network
throughput. Presumably this would be especially serious in a
multi-protocol setting.
WriteFile()
wrote zero bytes: Due to a bug in
PacketSendComplete() noticed by Skip Hanson,
Irp->IoStatus.Information is zeroed. Skip recommends
NdisQueryPacket(pPacket, NULL, NULL, NULL,
&(pIrp->IoStatus.Information))
to fix this. In my NT protocol
driver I use MmGetMdlByteCount() to show the calling app
the number of bytes sent by the app (as opposed to the grand
total including the framing added by the driver.) You can see this in BgpsSendCompleteHandler() in ntbgps.c in the billgPC zipfile.
But why is this so? This is a point that isn't covered adequately
by what might be called the standard sources
(the DDK's
documentation and Art Baker's The Windows NT Device Driver
Book), so we'll try to be thorough here. In what follows, we
are indebted to Jamie Hanrahan's postings in
comp.os.ms-windows.programmer.nt.kernel-mode as well as
the article on IRP cancel operations in the November-December 1997
issue of The NT Insider. (Responsibility for any
mistakes lies of course with this document's author.)
When all goes well, a driver's cleanup routine will be entered when a process's connection to the device's file object is severed. There are three ways to begin the process of severing the process's connection to the file object:
For the Packet sample's cleanup processing to be executed, it is not enough to begin the process of disconnecting the process from the file object. The cleanup routine will only be entered if all references to the corresponding file object have been eliminated. An IRP created by an I/O request that has not been completed has just such a reference. And the only way NT provides for a driver to clear a thread's references when Win32 tries to disconnect the file object is through IRP cancel routines.
The likely consequences of an application severing its connection to the Packet sample driver with an outstanding I/O (probably a read) include the following:
Rick Rigby was kind enough to contribute a solution to the I/O cancel bug. You can find his code, together with some commentary warning of a danger in it, here.
Another solution to the cancel problem which complements Rigby's but has its own flaws, is in the NT driver for billgPC, the free bootp client by this page's author. One flaw is that the use of the global cancel spinlock where a private spinlock could improve overall system performance. (The normal lifespan of the driver, and the application it supports, is only a few seconds.) The second flaw is a matter of debate. In the first of the two illuminating articles on I/O cancel operations in OSR's The NT Insider the reader is strongly advised to ignore the cancel routine's IRP pointer argument and search the driver's own queue for the first IRP with a cancel flag set. This, the article claims, is the one to dequeue and cancel; the reasons for this are promised in the second installment. Unfortunately, the second part of the article lacks an explanation for this algorithm, and billgPC's driver does heed the IRP pointer argument. To see billgPC's implementation, search for BgpsCancelReadIrp in ntbgps.c in the billgPC zipfile.