Harsh-06/Malware_Classifier_For_PE_Files

 
 

Malware_Classifier_For_PE_Files

Command-line tool that scores PE-format files based on data from their PE headers.

Dataset feature characteristics

Size of the dataset

  • Rows : 138047
  • Columns : 57
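
The row, column, and per-feature "Distinct values" figures in this README can be recomputed by counting unique values per column. The following is a minimal sketch using only the standard library. It assumes the dataset is a delimited text file with a header row; the function name `dataset_summary`, the delimiter default, and the three sample rows are illustrative assumptions, not actual dataset contents.

```python
import csv
import io


def dataset_summary(handle, delimiter=","):
    """Return (row_count, column_count, {column: distinct_value_count})."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    columns = reader.fieldnames or []
    seen = {name: set() for name in columns}
    rows = 0
    for record in reader:
        rows += 1
        for name in columns:
            seen[name].add(record[name])
    distinct = {name: len(values) for name, values in seen.items()}
    return rows, len(columns), distinct


# Hypothetical three-row sample standing in for the real dataset file.
SAMPLE = io.StringIO(
    "Name,Machine,Legitimate\n"
    "a.exe,332,1\n"
    "b.exe,332,0\n"
    "c.dll,34404,1\n"
)
rows, cols, distinct = dataset_summary(SAMPLE)
print(rows, cols, distinct)
```

To run it on the real dataset, pass an open file handle instead of `SAMPLE` and set `delimiter` to whatever separator the file uses.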

Name

  • Description : Name of the PE format file.
  • Type : Identifier
  • Distinct values : 138047

md5

  • Description : MD5 hash of the PE format file.
  • Type : Identifier
  • Distinct values : 138047

Machine

  • Description : Type of target machine on which the image file can run.
  • Type : Categorical
  • Distinct values : 3

SizeOfOptionalHeader

  • Description : The size of the optional header, which is required for executable files but not for object files.
  • Type : Numerical
  • Distinct values : 5

Characteristics

  • Description : The flags that indicate the attributes of the file.
  • Type : Categorical
  • Distinct values : 104
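
Machine, SizeOfOptionalHeader, and Characteristics all live in the 20-byte COFF file header that follows the `PE\0\0` signature. The sketch below shows one way to read them with the standard `struct` module. It is not taken from this repository's code: `read_coff_header` and `build_stub` are hypothetical helper names, and the stub is a synthetic header-only byte string for demonstration, not a loadable executable.

```python
import struct

# COFF file header (20 bytes, little-endian): Machine, NumberOfSections,
# TimeDateStamp, PointerToSymbolTable, NumberOfSymbols,
# SizeOfOptionalHeader, Characteristics.
COFF_HEADER = struct.Struct("<HHIIIHH")


def read_coff_header(data: bytes) -> dict:
    """Extract Machine, SizeOfOptionalHeader and Characteristics from PE bytes."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # The 4-byte field at offset 0x3C holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    fields = COFF_HEADER.unpack_from(data, pe_offset + 4)
    return {
        "Machine": fields[0],
        "SizeOfOptionalHeader": fields[5],
        "Characteristics": fields[6],
    }


def build_stub(machine: int, opt_size: int, characteristics: int) -> bytes:
    """Build a synthetic header-only stub (hypothetical helper for the demo)."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # PE signature right after the stub
    coff = COFF_HEADER.pack(machine, 0, 0, 0, 0, opt_size, characteristics)
    return bytes(dos) + b"PE\0\0" + coff


header = read_coff_header(build_stub(0x014C, 224, 0x0102))
print(header)
```

For a real file, pass the bytes read from disk in binary mode to `read_coff_header`.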

MajorLinkerVersion

  • Description : The linker major version number.
  • Type : Categorical
  • Distinct values : 41

MinorLinkerVersion

  • Description : The linker minor version number.
  • Type : Categorical
  • Distinct values : 62

SizeOfCode

  • Description : The size of the code (text) section, or the sum of all code sections.
  • Type : Numerical
  • Distinct values : 3809

SizeOfInitializedData

  • Description : The size of the initialized data section, or the sum of all such sections.
  • Type : Numerical
  • Distinct values : 3217

SizeOfUninitializedData

  • Description : The size of the uninitialized data section (BSS), or the sum of all such sections.
  • Type : Numerical
  • Distinct values : 441

AddressOfEntryPoint

  • Description : The address of the entry point relative to the image base when the executable file is loaded into memory.
  • Type : Categorical
  • Distinct values : 23110

BaseOfCode

  • Description : The address, relative to the image base, of the beginning of the code section when it is loaded into memory.
  • Type : Categorical
  • Distinct values : 385

BaseOfData

  • Description : The address, relative to the image base, of the beginning of the data section when it is loaded into memory.
  • Type : Categorical
  • Distinct values : 1106

ImageBase

  • Description : The preferred address of the first byte of the image when loaded into memory.
  • Type : Identifier
  • Distinct values : 9099

SectionAlignment

  • Description : The alignment (in bytes) of sections when they are loaded into memory.
  • Type : Numerical
  • Distinct values : 12

FileAlignment

  • Description : The alignment factor (in bytes) that is used to align the raw data of sections in the image file.
  • Type : Numerical
  • Distinct values : 9

MajorOperatingSystemVersion

  • Description : The major version number of the required operating system.
  • Type : Categorical
  • Distinct values : 12

MinorOperatingSystemVersion

  • Description : The minor version number of the required operating system.
  • Type : Categorical
  • Distinct values : 12

MajorImageVersion

  • Description : The major version number of the image.
  • Type : Categorical
  • Distinct values : 38

MinorImageVersion

  • Description : The minor version number of the image.
  • Type : Categorical
  • Distinct values : 70

MajorSubsystemVersion

  • Description : The major version number of the subsystem.
  • Type : Categorical
  • Distinct values : 6

MinorSubsystemVersion

  • Description : The minor version number of the subsystem.
  • Type : Categorical
  • Distinct values : 10

SizeOfImage

  • Description : The size (in bytes) of the image, including all headers, as the image is loaded in memory. It must be a multiple of SectionAlignment.
  • Type : Numerical
  • Distinct values : 2312

SizeOfHeader

  • Description : The combined size of an MS-DOS stub, PE header, and section headers rounded up to a multiple of FileAlignment.
  • Type : Numerical
  • Distinct values : 30
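
SizeOfImage must be a multiple of SectionAlignment, and SizeOfHeader is rounded up to a multiple of FileAlignment. A small sketch of both checks follows. The helper names `align_up` and `is_aligned` and the numeric values are illustrative assumptions, not values from the dataset.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    if alignment <= 0:
        raise ValueError("alignment must be positive")
    # Ceiling division using integer arithmetic only.
    return -(-value // alignment) * alignment


def is_aligned(value: int, alignment: int) -> bool:
    """Return True when value is an exact multiple of alignment."""
    return alignment > 0 and value % alignment == 0


# Hypothetical values: 0x2F8 bytes of raw headers, FileAlignment of 0x200.
size_of_header = align_up(0x2F8, 0x200)
print(hex(size_of_header))

# Hypothetical SizeOfImage values checked against a SectionAlignment of 0x1000.
print(is_aligned(0x5000, 0x1000), is_aligned(0x5200, 0x1000))
```

A SizeOfImage that fails the `is_aligned` check violates the rule stated above, which could itself be a useful signal when scoring a file.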

CheckSum

  • Description : The image file checksum. The algorithm for computing the checksum is incorporated into IMAGHELP.DLL. The following are checked for validation at load time: all drivers, any DLL loaded at boot time, and any DLL that is loaded into a critical Windows process.
  • Type : Identifier
  • Distinct values : 81633

SubSystem

  • Description : The subsystem that is required to run this image.
  • Type : Categorical
  • Distinct values : 4

DllCharacteristics

  • Description : The flags that describe the DLL characteristics of the image.
  • Type : Categorical
  • Distinct values : 74

SizeOfStackReserve

  • Description : The size of the stack to reserve. Only SizeOfStackCommit is committed; the rest is made available one page at a time until the reserve size is reached.
  • Type : Numerical
  • Distinct values : 40

SizeOfStackCommit

  • Description : The size of the stack to commit.
  • Type : Numerical
  • Distinct values : 40

SizeOfHeapReserve

  • Description : The size of the local heap space to reserve. Only SizeOfHeapCommit is committed; the rest is made available one page at a time until the reserve size is reached.
  • Type : Numerical
  • Distinct values : 30

SizeOfHeapCommit

  • Description : The size of the local heap space to commit.
  • Type : Numerical
  • Distinct values : 21

LoaderFlags

  • Description : Reserved, must be zero.
  • Type : Categorical
  • Distinct values : 15

NumberOfRvaAndSizes

  • Description : The number of data-directory entries in the remainder of the optional header. Each describes a location and size.
  • Type : Numerical
  • Distinct values : 23
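
Each data-directory entry counted by NumberOfRvaAndSizes is an 8-byte pair: a 32-bit relative virtual address followed by a 32-bit size. The sketch below parses such entries from a byte buffer. `read_data_directories` is a hypothetical helper, and the buffer is synthetic demonstration data rather than bytes from a real file.

```python
import struct

# One data-directory entry: VirtualAddress (RVA) and Size, both 32-bit little-endian.
DATA_DIRECTORY = struct.Struct("<II")


def read_data_directories(buf: bytes, count: int, offset: int = 0):
    """Parse `count` (rva, size) pairs starting at `offset` in `buf`."""
    needed = offset + count * DATA_DIRECTORY.size
    if needed > len(buf):
        # A declared count that overruns the buffer indicates a malformed header.
        raise ValueError("buffer too short for the declared number of entries")
    return [
        DATA_DIRECTORY.unpack_from(buf, offset + i * DATA_DIRECTORY.size)
        for i in range(count)
    ]


# Synthetic buffer holding two entries.
raw = struct.pack("<IIII", 0x2000, 0x50, 0x3000, 0x120)
entries = read_data_directories(raw, 2)
print(entries)
```

The length check matters because the count is read from the file itself and cannot be trusted for malformed or hostile inputs.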

SectionsNb

  • Description : (Not found)
  • Type : Numerical
  • Distinct values : 28

SectionsMeanEntropy

  • Description : (Not found)
  • Type : Numerical
  • Distinct values : 58807

SectionsMinEntropy

  • Description : (Not found)
  • Type : Numerical
  • Distinct values : 25505

SectionsMaxEntropy

  • Description : (Not found)
  • Type : Numerical
  • Distinct values : 49062
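
The section entropy features have no description in this README. A common choice for features of this kind is the Shannon entropy of each section's raw bytes, in bits per byte (0 to 8), summarized as mean, minimum, and maximum across sections. Whether this dataset was built that way is an assumption. The sketch below illustrates the idea; the function names and sample byte strings are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # Sum of p * log2(1/p) over the byte values that occur.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def entropy_summary(sections) -> dict:
    """Mean, min and max entropy over an iterable of section byte strings."""
    entropies = [shannon_entropy(section) for section in sections]
    if not entropies:
        raise ValueError("at least one section is required")
    return {
        "mean": sum(entropies) / len(entropies),
        "min": min(entropies),
        "max": max(entropies),
    }


# Hypothetical sections: all zeros, every byte value once, and two symbols.
sample_sections = [bytes(256), bytes(range(256)), b"ABAB" * 64]
summary = entropy_summary(sample_sections)
print(summary)
```

Values near 8 typically mean compressed or encrypted content, which is why entropy is often used as a packing indicator.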

SectionsMeanRawsize

  • Description : (Not found)
  • Type : Numerical
  • Distinct values : 9233

SectionsMinRawsize

  • Description : (Not found)
  • Type : Numerical
  • Distinct values : 694

SectionsMaxRawsize

  • Description : (Not found)
  • Type : Numerical
  • Distinct values : 4796

SectionsMeanVirtualsize

  • Description : (Not found)
  • Type : Numerical
  • Distinct values : 36811

SectionsMinVirtualsize

  • Description : (Not found)
  • Type : Numerical
  • Distinct values : 6515

SectionsMaxVirtualsize

  • Description : (Not found)
  • Type : Numerical
  • Distinct values : 29123

ImportsNbDLL

  • Description : (Not found)
  • Type : Numerical
  • Distinct values : 48

ImportsNb

  • Description : (Not found)
  • Type : Numerical
  • Distinct values : 954

ImportsNbOrdinal

  • Description : (Not found)
  • Type : Numerical
  • Distinct values : 337

ExportNb

  • Description : (Not found)
  • Type : Numerical
  • Distinct values : 670

ResourcesNb

  • Description : (Not found)
  • Type : Numerical
  • Distinct values : 496

ResourcesMeanEntropy

  • Description : (Not found)
  • Type : Numerical
  • Distinct values : 42745

ResourcesMinEntropy

  • Description : (Not found)
  • Type : Numerical
  • Distinct values : 17929

ResourcesMaxEntropy

  • Description : (Not found)
  • Type : Numerical
  • Distinct values : 23004

ResourcesMeanSize

  • Description : (Not found)
  • Type : Numerical
  • Distinct values : 16013

ResourcesMinSize

  • Description : (Not found)
  • Type : Numerical
  • Distinct values : 1011

ResourcesMaxSize

  • Description : (Not found)
  • Type : Numerical
  • Distinct values : 6150

LoadConfigurationSize

  • Description : (Not found)
  • Type : Numerical
  • Distinct values : 39

VersionInformationSize

  • Description : (Not found)
  • Type : Numerical
  • Distinct values : 20

Legitimate

  • Description : Class label indicating whether the file is legitimate or malware.
  • Type : Categorical
  • Distinct values : 2
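
The Type column above suggests a natural split for modeling: columns typed Identifier carry no general signal, and Legitimate is the label. The sketch below separates one record accordingly. It assumes that all four Identifier columns are dropped, that the remaining values are numeric, and that the label is stored as an integer. None of this is confirmed by the repository, and `split_record` and the sample record are hypothetical.

```python
# Columns typed "Identifier" in this README (assumed to be excluded from training).
IDENTIFIER_COLUMNS = {"Name", "md5", "ImageBase", "CheckSum"}
LABEL_COLUMN = "Legitimate"


def split_record(record: dict):
    """Split one dataset row into (feature_dict, label)."""
    features = {
        name: float(value)
        for name, value in record.items()
        if name not in IDENTIFIER_COLUMNS and name != LABEL_COLUMN
    }
    label = int(record[LABEL_COLUMN])
    return features, label


# Hypothetical, abbreviated record; a real row has 57 columns.
sample_record = {
    "Name": "a.exe",
    "md5": "0" * 32,
    "Machine": "332",
    "SectionsNb": "5",
    "Legitimate": "1",
}
features, label = split_record(sample_record)
print(features, label)
```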
