
MSI-HOWTO.txt


		The MSI Driver Guide HOWTO
	Tom L Nguyen [email protected]
			07/15/2003

1. About this guide

This guide describes the basics of Message Signaled Interrupts (MSI), the
advantages of using MSI over traditional interrupt mechanisms, and how
to enable your driver to use MSI or MSI-X. It also describes the
available debugging features and answers Frequently Asked Questions.

2. Copyright 2003 Intel Corporation 

3. What is MSI/MSI-X?

Message Signaled Interrupt (MSI), as described in the PCI Local Bus
Specification Revision 2.3 or later, is an optional feature for PCI
devices and a required feature for PCI Express devices. MSI enables a
device function to request service by sending an Inbound Memory Write
on its PCI bus to the FSB as a Message Signaled Interrupt transaction.
Because MSI is generated in the form of a Memory Write, all transaction
conditions, such as a Retry, Master-Abort, Target-Abort or normal
completion, are supported.

A PCI device that supports MSI must also support the pin-based IRQ
assertion mechanism to provide backward compatibility for systems that
do not support MSI. In systems that support MSI, the bus driver is
responsible for initializing the message address and message data of
the device function's MSI/MSI-X capability structure during initial
device configuration.

An MSI capable device function indicates MSI support by implementing 
the MSI/MSI-X capability structure in its PCI capability list. The 
device function may implement both the MSI capability structure and the 
MSI-X capability structure; however, the bus driver should not enable 
both, but instead enable only the MSI-X capability structure.

The MSI capability structure contains the Message Control register,
the Message Address register and the Message Data register. These
registers provide the bus driver control over MSI. The Message Control
register indicates the MSI capability supported by the device. The
Message Address register specifies the target address and the Message
Data register specifies the characteristics of the message. To request
service, the device function writes the content of the Message Data
register to the target address. The device and its software driver
are prohibited from writing to these registers.
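
The fields of the Message Control register can be illustrated with
a small decoder. This is a minimal sketch, assuming the bit layout
given in the PCI Local Bus Specification (bit 0 MSI Enable, bits
3:1 Multiple Message Capable, bits 6:4 Multiple Message Enable,
bit 7 64-bit Address Capable). The structure and function names are
hypothetical and are not part of the patch.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumed bit layout of the 16-bit MSI Message Control register. */
#define MSI_CTRL_ENABLE     0x0001u  /* bit 0: MSI Enable */
#define MSI_CTRL_MMC_MASK   0x000Eu  /* bits 3:1: Multiple Message Capable */
#define MSI_CTRL_MME_MASK   0x0070u  /* bits 6:4: Multiple Message Enable */
#define MSI_CTRL_64BIT      0x0080u  /* bit 7: 64-bit Address Capable */

/* Hypothetical decoded view of the register, for illustration only. */
struct msi_ctrl_info {
    bool enabled;               /* MSI currently enabled */
    unsigned vectors_capable;   /* vectors the function can use */
    unsigned vectors_enabled;   /* vectors granted by system software */
    bool addr64;                /* 64-bit message address supported */
};

static struct msi_ctrl_info msi_decode_control(uint16_t ctrl)
{
    struct msi_ctrl_info info;

    info.enabled = (ctrl & MSI_CTRL_ENABLE) != 0;
    /* Both multiple-message fields encode a power of two. */
    info.vectors_capable = 1u << ((ctrl & MSI_CTRL_MMC_MASK) >> 1);
    info.vectors_enabled = 1u << ((ctrl & MSI_CTRL_MME_MASK) >> 4);
    info.addr64 = (ctrl & MSI_CTRL_64BIT) != 0;
    return info;
}
```

With this patch the bus driver programs a single vector, so a
decoded register would be expected to report one enabled vector
even when the function is capable of more.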

The MSI-X capability structure is an optional extension to MSI. It uses 
an independent and separate capability structure. There are some key 
advantages to implementing the MSI-X capability structure over the MSI 
capability structure as described below.

	- Support a larger maximum number of vectors per function.

	- Provide the ability for system software to configure each
	  vector with an independent message address and message data,
	  specified by a table that resides in Memory Space. 

	- MSI and MSI-X both support per-vector masking, but
	  per-vector masking is an optional extension of MSI and a
	  required feature for MSI-X. Per-vector masking gives the
	  kernel the ability to mask/unmask an MSI while servicing its
	  interrupt service routine. If per-vector masking is not
	  supported, then the device driver should provide the
	  hardware/software synchronization to ensure that the device
	  generates an MSI when the driver wants it to do so.
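
The per-vector address, data and mask described in the list above
can be pictured with a model of one MSI-X table entry. This is a
sketch only, assuming the entry layout from the PCI specification
(four 32-bit words, with the mask in bit 0 of the last word) and a
Table Size field encoded as N-1 in bits 10:0 of the MSI-X Message
Control register. It works on ordinary memory, not on a device, and
all names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Model of one 16-byte MSI-X table entry (assumed layout). */
struct msix_entry_model {
    uint32_t addr_lo;       /* Message Address */
    uint32_t addr_hi;       /* Message Upper Address */
    uint32_t data;          /* Message Data */
    uint32_t vector_ctrl;   /* Vector Control; bit 0 is the mask bit */
};

#define MSIX_VECTOR_CTRL_MASKBIT  0x1u
#define MSIX_CTRL_TABLE_SIZE_MASK 0x07FFu

/* Number of table entries encoded in the MSI-X Message Control register. */
static unsigned msix_table_size(uint16_t msg_ctrl)
{
    return (unsigned)(msg_ctrl & MSIX_CTRL_TABLE_SIZE_MASK) + 1u;
}

/* Mask or unmask one vector without touching the other entries. */
static void msix_model_set_mask(struct msix_entry_model *e, bool masked)
{
    if (masked)
        e->vector_ctrl |= MSIX_VECTOR_CTRL_MASKBIT;
    else
        e->vector_ctrl &= ~MSIX_VECTOR_CTRL_MASKBIT;
}

static bool msix_model_is_masked(const struct msix_entry_model *e)
{
    return (e->vector_ctrl & MSIX_VECTOR_CTRL_MASKBIT) != 0;
}
```

In a real system only the PCI subsystem touches this table; the
model is meant to show why each vector can be configured and masked
independently.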

4. Why use MSI? 

MSI simplifies board design by allowing board designers to remove
out-of-band interrupt routing. MSI is another step towards a
legacy-free environment.

Due to increasing pressure on chipset and processor packages to reduce pin 
count, the need for interrupt pins is expected to diminish over time. Devices, 
due to pin constraints, may implement messages to increase performance. 

PCI Express endpoints use INTx emulation (in-band messages) instead of IRQ pin
assertion. Using INTx emulation requires interrupt sharing among devices
connected to the same node (PCI bridge), while MSI is unique (non-shared) and
does not require BIOS configuration support. As a result, the PCI Express
technology requires MSI support for better interrupt performance.
 
Using MSI enables a device function to support two or more vectors,
which can be configured to target different CPUs to increase scalability.

5. Configuring a driver to use MSI/MSI-X

By default, the kernel will not enable MSI/MSI-X on all devices that
support this capability, because some devices may have bugs in their
MSI support (see section 5.4). A kernel configuration option must be
selected to enable MSI/MSI-X support.

5.1 Including MSI support into the kernel

Including MSI support in the kernel requires users to rebuild the
kernel with both the configuration parameters CONFIG_PCI_USE_VECTOR and
CONFIG_PCI_MSI set. CONFIG_PCI_USE_VECTOR enables the kernel to
replace the IRQ-based scheme with a VECTOR-based scheme, because MSI
requires a unique vector and no BIOS interrupt-routing
table. CONFIG_PCI_MSI enables MSI support in the kernel.

During PCI device enumeration, the bus driver initializes the device's
MSI/MSI-X capability structure with ONE vector, regardless of whether
the device function is capable of supporting multiple vectors.

ONE vector is initially allocated to the device function and the vector is 
stored in the irq field of the device (pci_dev) structure. This default 
initialization allows legacy drivers to work without specific modification to 
support MSI.
	
5.2 Configuring for MSI support

Because the existing Linux kernel assigns vectors in a
non-contiguous fashion, this patch does not support multiple
messages, regardless of whether the device function is capable of
supporting more than one vector. The bus driver initializes only
entry 0 of this capability. Existing software drivers of this
device function will work without changes if no
hardware/software synchronization is required. Otherwise, the
device driver should be updated to provide the hardware/software
synchronization, because multiple messages generated from the
same vector might be lost. In other words, once the device
function signals Vector A, it cannot signal Vector A again until
it is explicitly enabled to do so by its device driver. It is
recommended that IHVs validate their hardware devices against
their existing device drivers once the patch is installed. Please
refer to section 5.4 Debugging MSI.

5.3 Configuring for MSI-X support

The MSI capability structure and the MSI-X capability structure
share the semantics described above. However, because system
software can configure each vector of the MSI-X capability
structure with an independent message address and message data,
the non-contiguous vector assignment of the existing Linux kernel
has no impact on supporting multiple messages on an MSI-X capable
device function. By default, as mentioned above, ONE vector is
always allocated to the MSI-X capability structure at entry 0. The
bus driver does not initialize other entries of the table during
device enumeration. Note that the PCI subsystem should have full
control of an MSI-X table that resides in Memory Space. The
software device driver should not access this table.

To request additional vectors, the device software driver should
call the function msix_alloc_vectors(). It is recommended that
the software driver call this function once, during the
initialization phase of the device driver. With these semantics,
the existing software device driver may work with one vector if
no hardware/software synchronization is required. It is
recommended that IHVs validate their hardware devices against
their existing device drivers once the patch is installed. Please
refer to section 5.4 Debugging MSI.

The function msix_alloc_vectors(), once invoked, enables either
all or nothing, depending on the current availability of vector
resources. If no vector resources are available, the device
function still works with ONE vector. If the vector resources are
available for the number of vectors requested by the driver, this
function will reconfigure the MSI-X capability structure of the
device with additional messages, starting from entry 1. For
example, a device may be capable of supporting a maximum of 32
vectors while its software driver typically requests only 4
vectors.

After this call succeeds, the device driver is responsible for
calling other functions such as request_irq(), enable_irq(), etc.
for each vector, to enable it with its corresponding interrupt
service handler. It is the device driver's choice to have all
vectors share the same interrupt service handler or to give each
vector a unique interrupt service handler.

In addition to the function msix_alloc_vectors(), another 
function msix_free_vectors() is provided to allow the software 
driver to release a number of vectors back to the vector 
resources. Once invoked, the PCI subsystem disables (masks) each 
vector released. These vectors are no longer valid for the 
hardware device and its software driver to use.

int msix_alloc_vectors(struct pci_dev *dev, int *vector, int nvec)

This API enables the software driver to request additional
messages from the PCI subsystem. Depending on the number of
vectors available, the PCI subsystem enables either all or
nothing.

Argument dev points to the device (pci_dev) structure.
Argument vector is a pointer to an array of integers. The number
of elements is indicated in argument nvec.
Argument nvec is an integer indicating the number of messages
requested.
A return of zero indicates that the requested number of vectors
was successfully allocated. Otherwise, it indicates that
resources are not available.

int msix_free_vectors(struct pci_dev *dev, int *vector, int nvec)

This API enables the software driver to inform the PCI subsystem
that it is willing to release a number of vectors back to the MSI
resource pool. Once invoked, the PCI subsystem disables each
MSI-X entry associated with each vector stored in argument
vector. These vectors are no longer valid for the hardware device
and its software driver to use.

Argument dev points to the device (pci_dev) structure.
Argument vector is a pointer to an array of integers. The number
of elements is indicated in argument nvec.
Argument nvec is an integer indicating the number of messages
released.
A return of zero indicates that the vectors were successfully
released. Otherwise, it indicates a failure.
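
The all-or-nothing behavior of the two calls can be tried out in
user space with a small model. This is only a sketch: it does not
use the kernel functions themselves, and the pool size, the vector
numbers, struct pci_dev_model and the model_ function names are all
hypothetical stand-ins chosen for illustration. Only the argument
order and the zero/non-zero return convention follow the
descriptions above.

```c
#include <stddef.h>

#define MODEL_POOL_SIZE    8    /* hypothetical number of spare vectors */
#define MODEL_FIRST_VECTOR 100  /* hypothetical first vector number */

/* Stand-in for the kernel's struct pci_dev; only irq is modeled. */
struct pci_dev_model {
    int irq;    /* the ONE vector assigned at enumeration */
};

static int model_pool_used[MODEL_POOL_SIZE];

/* All-or-nothing: 0 on success, -1 if fewer than nvec vectors are free. */
static int model_msix_alloc_vectors(struct pci_dev_model *dev,
                                    int *vector, int nvec)
{
    int i, free_count = 0, filled = 0;

    if (dev == NULL || vector == NULL || nvec <= 0)
        return -1;
    for (i = 0; i < MODEL_POOL_SIZE; i++)
        if (!model_pool_used[i])
            free_count++;
    if (free_count < nvec)
        return -1;              /* nothing is allocated on failure */
    for (i = 0; i < MODEL_POOL_SIZE && filled < nvec; i++) {
        if (!model_pool_used[i]) {
            model_pool_used[i] = 1;
            vector[filled++] = MODEL_FIRST_VECTOR + i;
        }
    }
    return 0;
}

/* Return nvec previously allocated vectors to the pool: 0 on success. */
static int model_msix_free_vectors(struct pci_dev_model *dev,
                                   int *vector, int nvec)
{
    int i, slot;

    if (dev == NULL || vector == NULL || nvec <= 0)
        return -1;
    for (i = 0; i < nvec; i++) {    /* validate before changing anything */
        slot = vector[i] - MODEL_FIRST_VECTOR;
        if (slot < 0 || slot >= MODEL_POOL_SIZE || !model_pool_used[slot])
            return -1;
    }
    for (i = 0; i < nvec; i++)
        model_pool_used[vector[i] - MODEL_FIRST_VECTOR] = 0;
    return 0;
}
```

In a driver, a successful allocation would be followed by one
request_irq() call per returned vector, and a failed allocation
would leave the driver running on the single vector stored in the
irq field of the device structure.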

5.4 Debugging MSI

Some devices may have bugs in their MSI support and may break once
MSI support is enabled in the kernel. To debug these devices, the
patch provides two configuration parameters, CONFIG_PCI_MSI and
CONFIG_PCI_MSI_ON_SPECIFIC_DEVICES. Neither is set by default. When
users set CONFIG_PCI_MSI, CONFIG_PCI_MSI_ON_SPECIFIC_DEVICES is also
set by default. After users rebuild the kernel with this combination,
the kernel enables MSI only on the specific devices listed in the boot
parameter "device_msi=". Users must explicitly use this boot parameter
to provide a list of the specific devices on which they would like MSI
enabled. This lets users debug one MSI capable device at a time with
its existing software driver until all are fully validated, since it
may be difficult to debug all of them at the same time. The format
of "device_msi=" is similar to the format of "device_nomsi=", which is
described later in this section. Note that this boot parameter is
required only if the configuration parameter
CONFIG_PCI_MSI_ON_SPECIFIC_DEVICES is set. Otherwise, it is ignored.

Once users have finished validating these devices, they can clear the
configuration parameter CONFIG_PCI_MSI_ON_SPECIFIC_DEVICES to tell
the kernel that MSI should be enabled on all MSI capable devices. The
boot parameter "device_msi=" is then no longer required.

The patch also provides a second debug option, which requires users to set
the configuration parameter CONFIG_PCI_MSI and clear the configuration
parameter CONFIG_PCI_MSI_ON_SPECIFIC_DEVICES. After users rebuild the kernel
with this combination, the kernel enables MSI on all MSI capable devices by
default. The boot parameter "device_msi=" will be ignored. To disable MSI on
specific MSI capable devices that show signs of unpredictable behavior,
users must explicitly use the boot parameter "device_nomsi=", which
contains a list of the specific devices on which users do not want MSI
enabled. These devices default to IRQ pin assertion.

The format of this parameter is "device_nomsi=DWORD1,DWORD2,...".
Each DWORD in the list specifies a device function in terms of
device ID (higher word) and vendor ID (lower word). Each DWORD
should be in hex format with a prefix 0x.

For example, "device_nomsi=0x80119005,0x10108086" indicates that
the bus driver should not enable MSI(X) on two device functions
(Device ID = 0x8011 and Vendor ID = 0x9005, and Device ID = 0x1010
and Vendor ID = 0x8086).
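
The DWORD encoding can be checked with a few lines of C. This is a
minimal sketch of the format described above, not the kernel's own
parser. The function names are hypothetical, and the list passed to
nomsi_parse_list() is assumed to be the text that follows
"device_nomsi=".

```c
#include <stdint.h>
#include <stddef.h>
#include <stdlib.h>

/* Build one DWORD: device ID in the higher word, vendor ID in the lower. */
static uint32_t nomsi_dword(uint16_t device_id, uint16_t vendor_id)
{
    return ((uint32_t)device_id << 16) | vendor_id;
}

static uint16_t nomsi_device_id(uint32_t dword)
{
    return (uint16_t)(dword >> 16);
}

static uint16_t nomsi_vendor_id(uint32_t dword)
{
    return (uint16_t)(dword & 0xFFFFu);
}

/* Parse "0x...,0x...,..." into out[]; returns the number of DWORDs read. */
static size_t nomsi_parse_list(const char *list, uint32_t *out, size_t max)
{
    size_t n = 0;

    while (*list != '\0' && n < max) {
        char *end;
        unsigned long value = strtoul(list, &end, 16);  /* accepts 0x prefix */

        if (end == list)
            break;              /* no hex digits found */
        out[n++] = (uint32_t)value;
        if (*end != ',')
            break;              /* end of list or unexpected character */
        list = end + 1;
    }
    return n;
}
```

For the example above, nomsi_dword(0x8011, 0x9005) yields
0x80119005 and nomsi_dword(0x1010, 0x8086) yields 0x10108086.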
	
In addition to the boot parameter "device_nomsi=", another boot 
parameter "pci_nomsi" can be used to prohibit the bus driver from 
enabling MSI(X) on all MSI capable devices.

At the driver level, the software device driver can tell whether
MSI/MSI-X is enabled by reading the MSI enable bit of the
MSI/MSI-X capability structure's message control register. If
this bit is zero, the device function defaults to IRQ pin
assertion. If this bit is set, the device function is using MSI
as its interrupt generation mechanism.
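
The check can be sketched against a copy of a function's 256-byte
configuration space held in ordinary memory. This is an
illustration under stated assumptions, not driver code: it assumes
the standard PCI layout (Capabilities List bit in the Status
register at offset 0x06, Capabilities Pointer at offset 0x34,
capability ID 0x05 for MSI and 0x11 for MSI-X, enable in bit 0 of
the MSI message control register and bit 15 of the MSI-X one), and
every name in it is hypothetical.

```c
#include <stdint.h>

#define CFG_SIZE          256
#define CFG_STATUS        0x06  /* Status register, low byte */
#define CFG_STATUS_CAPS   0x10  /* Capabilities List bit */
#define CFG_CAP_POINTER   0x34  /* offset of the first capability */
#define CFG_CAP_ID_MSI    0x05
#define CFG_CAP_ID_MSIX   0x11

/* Return the offset of capability cap_id, or 0 if it is not present. */
static int cfg_find_capability(const uint8_t cfg[CFG_SIZE], uint8_t cap_id)
{
    int ttl = 48;   /* guard against a malformed, looping list */
    uint8_t pos;

    if (!(cfg[CFG_STATUS] & CFG_STATUS_CAPS))
        return 0;
    pos = cfg[CFG_CAP_POINTER] & 0xFC;
    while (pos >= 0x40 && ttl-- > 0) {
        if (cfg[pos] == cap_id)
            return pos;
        pos = cfg[pos + 1] & 0xFC;  /* next-capability pointer */
    }
    return 0;
}

/* 1 if the MSI enable bit (message control bit 0) is set. */
static int cfg_msi_enabled(const uint8_t cfg[CFG_SIZE])
{
    int pos = cfg_find_capability(cfg, CFG_CAP_ID_MSI);

    return pos != 0 && (cfg[pos + 2] & 0x01) != 0;
}

/* 1 if the MSI-X enable bit (message control bit 15) is set. */
static int cfg_msix_enabled(const uint8_t cfg[CFG_SIZE])
{
    int pos = cfg_find_capability(cfg, CFG_CAP_ID_MSIX);

    return pos != 0 && (cfg[pos + 3] & 0x80) != 0;
}
```

A driver would read the same registers through the kernel's PCI
configuration accessors rather than from a buffer; the buffer form
is used here only so the logic can be tried in user space.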

At the user level, users can use command 'cat /proc/interrupts' 
to display the vector allocated for the device and its interrupt 
mode, as shown below. 

     CPU0     CPU1    CPU2    CPU3       
0:   14175    0       17408   0    	IO-APIC-edge  	timer
1:   123      310     0       37    	IO-APIC-edge  	keyboard
2:   0        0       0       0     	XT-PIC  	cascade
8:   1        0       0       0    	IO-APIC-edge  	rtc
12:  41       0       0       813   	IO-APIC-edge  	PS/2 Mouse
14:  2744     7017    0       0     	IO-APIC-edge  	ide0
15:  1515     1       0       418   	IO-APIC-edge  	ide1
169: 0        0       0       0   	IO-APIC-level  	usb-uhci
185: 0        0       0       0   	IO-APIC-level  	usb-uhci
193: 30       0       0       0     	PCI MSI		aic79xx
201: 30       0       0       0     	PCI MSI		aic79xx
209: 467      0       0       0   	IO-APIC-level  	eth1
225: 15       0       0       0   	IO-APIC-level  	aic7xxx
233: 15       0       0       0   	IO-APIC-level  	aic7xxx
NMI: 0        0       0       0 
LOC: 31446    31448   31448   31448 
ERR: 0
MIS: 0

6. FAQ

Q1. Are there any limitations on using the MSI?

A1. If the PCI device supports MSI and conforms to the
specification and the platform supports the APIC local bus,
then using MSI should work.

Q2. Will it work on all the Pentium processors (P3, P4, Xeon,
AMD processors)? In P3, IPIs are transmitted on the APIC local
bus, and in P4 and Xeon they are transmitted on the system
bus. Are there any implications with this?

A2. MSI support enables a PCI device to send an inbound
memory write (0xfeexxxxx as target address) on its PCI bus
directly to the FSB. Since the message address has the
redirection hint bit cleared, it should work.
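
The 0xfeexxxxx target address can be sketched as follows. This
assumes the x86 message address layout from the Intel architecture
documentation (bits 31:20 fixed at 0xFEE, bits 19:12 destination
ID, bit 3 redirection hint, bit 2 destination mode), which this
guide does not spell out. The function names are hypothetical.

```c
#include <stdint.h>

#define MSI_X86_ADDR_BASE   0xFEE00000u
#define MSI_X86_ADDR_RH     (1u << 3)   /* redirection hint */
#define MSI_X86_ADDR_DM     (1u << 2)   /* destination mode */

/* Compose a message address for the given destination ID. */
static uint32_t msi_x86_address(uint8_t dest_id, int redirection_hint,
                                int logical_mode)
{
    uint32_t addr = MSI_X86_ADDR_BASE | ((uint32_t)dest_id << 12);

    if (redirection_hint)
        addr |= MSI_X86_ADDR_RH;
    if (logical_mode)
        addr |= MSI_X86_ADDR_DM;
    return addr;
}

/* 1 if the address falls in the 0xfeexxxxx interrupt message range. */
static int msi_x86_is_message_address(uint32_t addr)
{
    return (addr & 0xFFF00000u) == MSI_X86_ADDR_BASE;
}
```

With the redirection hint cleared, as in the answer above, a
message for destination ID 1 would use address 0xFEE01000.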

Q3. The target address 0xfeexxxxx will be translated by the
Host Bridge into an interrupt message. Are there any
limitations on the chipsets such as Intel 8xx, Intel e7xxx,
or VIA?

A3. If these chipsets support an inbound memory write with
the target address set to 0xfeexxxxx, in conformance with PCI
specification 2.3 or later, then it should work.

Q4. From the driver point of view, if the MSI is lost because
of errors that occur during the inbound memory write, then the
driver may wait forever. Is there a mechanism for it to recover?

A4. Since the target of the transaction is an inbound memory
write, all transaction termination conditions (Retry,
Master-Abort, Target-Abort, or normal completion) are
supported. A device sending an MSI must abide by all the PCI
rules and conditions regarding that inbound memory write. So,
if a retry is signaled it must retry, and so on. We believe that
the recommendation for Abort is also a retry (refer to PCI
specification 2.3 or later).








