๐Ÿซถ๐Ÿป U1FAF6-project
  • Welcome to ๐Ÿซถ๐Ÿป U1FAF6-project!
  • ไธ€็จฎๅŠ่‡ชๅ‹•็š„ๆœ€ๅฐๅฏ่กŒๅฏซไฝœๆ–นๆกˆ
  • 20250214 ch2
  • 20250214 ch3
  • Weekly Chellange
  • R Performance Benchmark Results
  • ้—œๆ–ผ COLAB ไฝฟ็”จ็š„ๅฐๆŠ€ๅทง
  • ๆ•™่‚ฒๆธฌ้‡ๅญธๆœŸๅˆŠ่ง€ๅฏŸๆธ…ๅ–ฎ
  • 20260313 DINA pymc
    • ่€ƒ้‡้ปžใ€‚
    • DCM ่ˆ‡ mixture modelingใ€‚
    • MCMC
  • 20260313 r gdina
  • ้•ทๆ ผๅผ (lme4) ๅ’Œๅฏฌๆ ผๅผ (lavaan) ็š„ไผฐ่จˆๆ˜ฏ็›ธๅŒ็š„
๐Ÿซถ๐Ÿป U1FAF6-project
  • 20260313 DINA pymc

Inย [ย ]:
Copied!
pip install "nutpie[pymc]"
pip install "nutpie[pymc]"
Collecting nutpie[pymc]
  Downloading nutpie-0.16.8-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.5 kB)
Requirement already satisfied: pyarrow>=12.0.0 in /usr/local/lib/python3.12/dist-packages (from nutpie[pymc]) (18.1.0)
Collecting arro3-core>=0.6.0 (from nutpie[pymc])
  Downloading arro3_core-0.8.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (515 bytes)
Requirement already satisfied: pandas>=2.0 in /usr/local/lib/python3.12/dist-packages (from nutpie[pymc]) (2.2.2)
Requirement already satisfied: xarray>=2025.1.2 in /usr/local/lib/python3.12/dist-packages (from nutpie[pymc]) (2025.12.0)
Requirement already satisfied: arviz<1.0,>=0.20.0 in /usr/local/lib/python3.12/dist-packages (from nutpie[pymc]) (0.22.0)
Collecting obstore>=0.8.0 (from nutpie[pymc])
  Downloading obstore-0.9.2-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.9 kB)
Collecting zarr>=3.1.0 (from nutpie[pymc])
  Downloading zarr-3.1.5-py3-none-any.whl.metadata (10 kB)
Requirement already satisfied: pymc>=5.20.1 in /usr/local/lib/python3.12/dist-packages (from nutpie[pymc]) (5.28.1)
Requirement already satisfied: numba>=0.60.0 in /usr/local/lib/python3.12/dist-packages (from nutpie[pymc]) (0.60.0)
Requirement already satisfied: setuptools>=60.0.0 in /usr/local/lib/python3.12/dist-packages (from arviz<1.0,>=0.20.0->nutpie[pymc]) (75.2.0)
Requirement already satisfied: matplotlib>=3.8 in /usr/local/lib/python3.12/dist-packages (from arviz<1.0,>=0.20.0->nutpie[pymc]) (3.10.0)
Requirement already satisfied: numpy>=1.26.0 in /usr/local/lib/python3.12/dist-packages (from arviz<1.0,>=0.20.0->nutpie[pymc]) (2.0.2)
Requirement already satisfied: scipy>=1.11.0 in /usr/local/lib/python3.12/dist-packages (from arviz<1.0,>=0.20.0->nutpie[pymc]) (1.16.3)
Requirement already satisfied: packaging in /usr/local/lib/python3.12/dist-packages (from arviz<1.0,>=0.20.0->nutpie[pymc]) (26.0)
Requirement already satisfied: h5netcdf>=1.0.2 in /usr/local/lib/python3.12/dist-packages (from arviz<1.0,>=0.20.0->nutpie[pymc]) (1.8.1)
Requirement already satisfied: typing-extensions>=4.1.0 in /usr/local/lib/python3.12/dist-packages (from arviz<1.0,>=0.20.0->nutpie[pymc]) (4.15.0)
Requirement already satisfied: xarray-einstats>=0.3 in /usr/local/lib/python3.12/dist-packages (from arviz<1.0,>=0.20.0->nutpie[pymc]) (0.10.0)
Requirement already satisfied: llvmlite<0.44,>=0.43.0dev0 in /usr/local/lib/python3.12/dist-packages (from numba>=0.60.0->nutpie[pymc]) (0.43.0)
Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.12/dist-packages (from pandas>=2.0->nutpie[pymc]) (2.9.0.post0)
Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.12/dist-packages (from pandas>=2.0->nutpie[pymc]) (2025.2)
Requirement already satisfied: tzdata>=2022.7 in /usr/local/lib/python3.12/dist-packages (from pandas>=2.0->nutpie[pymc]) (2025.3)
Requirement already satisfied: cachetools<7,>=4.2.1 in /usr/local/lib/python3.12/dist-packages (from pymc>=5.20.1->nutpie[pymc]) (6.2.6)
Requirement already satisfied: cloudpickle in /usr/local/lib/python3.12/dist-packages (from pymc>=5.20.1->nutpie[pymc]) (3.1.2)
Requirement already satisfied: pytensor<2.39,>=2.38.0 in /usr/local/lib/python3.12/dist-packages (from pymc>=5.20.1->nutpie[pymc]) (2.38.2)
Requirement already satisfied: rich>=13.7.1 in /usr/local/lib/python3.12/dist-packages (from pymc>=5.20.1->nutpie[pymc]) (13.9.4)
Requirement already satisfied: threadpoolctl<4.0.0,>=3.1.0 in /usr/local/lib/python3.12/dist-packages (from pymc>=5.20.1->nutpie[pymc]) (3.6.0)
Collecting donfig>=0.8 (from zarr>=3.1.0->nutpie[pymc])
  Downloading donfig-0.8.1.post1-py3-none-any.whl.metadata (5.0 kB)
Requirement already satisfied: google-crc32c>=1.5 in /usr/local/lib/python3.12/dist-packages (from zarr>=3.1.0->nutpie[pymc]) (1.8.0)
Collecting numcodecs>=0.14 (from zarr>=3.1.0->nutpie[pymc])
  Downloading numcodecs-0.16.5-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (3.4 kB)
Requirement already satisfied: pyyaml in /usr/local/lib/python3.12/dist-packages (from donfig>=0.8->zarr>=3.1.0->nutpie[pymc]) (6.0.3)
Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.12/dist-packages (from matplotlib>=3.8->arviz<1.0,>=0.20.0->nutpie[pymc]) (1.3.3)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.12/dist-packages (from matplotlib>=3.8->arviz<1.0,>=0.20.0->nutpie[pymc]) (0.12.1)
Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.12/dist-packages (from matplotlib>=3.8->arviz<1.0,>=0.20.0->nutpie[pymc]) (4.62.0)
Requirement already satisfied: kiwisolver>=1.3.1 in /usr/local/lib/python3.12/dist-packages (from matplotlib>=3.8->arviz<1.0,>=0.20.0->nutpie[pymc]) (1.5.0)
Requirement already satisfied: pillow>=8 in /usr/local/lib/python3.12/dist-packages (from matplotlib>=3.8->arviz<1.0,>=0.20.0->nutpie[pymc]) (11.3.0)
Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.12/dist-packages (from matplotlib>=3.8->arviz<1.0,>=0.20.0->nutpie[pymc]) (3.3.2)
Requirement already satisfied: filelock>=3.15 in /usr/local/lib/python3.12/dist-packages (from pytensor<2.39,>=2.38.0->pymc>=5.20.1->nutpie[pymc]) (3.25.1)
Requirement already satisfied: etuples in /usr/local/lib/python3.12/dist-packages (from pytensor<2.39,>=2.38.0->pymc>=5.20.1->nutpie[pymc]) (0.3.10)
Requirement already satisfied: logical-unification in /usr/local/lib/python3.12/dist-packages (from pytensor<2.39,>=2.38.0->pymc>=5.20.1->nutpie[pymc]) (0.4.7)
Requirement already satisfied: miniKanren in /usr/local/lib/python3.12/dist-packages (from pytensor<2.39,>=2.38.0->pymc>=5.20.1->nutpie[pymc]) (1.0.5)
Requirement already satisfied: cons in /usr/local/lib/python3.12/dist-packages (from pytensor<2.39,>=2.38.0->pymc>=5.20.1->nutpie[pymc]) (0.4.7)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.12/dist-packages (from python-dateutil>=2.8.2->pandas>=2.0->nutpie[pymc]) (1.17.0)
Requirement already satisfied: markdown-it-py>=2.2.0 in /usr/local/lib/python3.12/dist-packages (from rich>=13.7.1->pymc>=5.20.1->nutpie[pymc]) (4.0.0)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in /usr/local/lib/python3.12/dist-packages (from rich>=13.7.1->pymc>=5.20.1->nutpie[pymc]) (2.19.2)
Requirement already satisfied: mdurl~=0.1 in /usr/local/lib/python3.12/dist-packages (from markdown-it-py>=2.2.0->rich>=13.7.1->pymc>=5.20.1->nutpie[pymc]) (0.1.2)
Requirement already satisfied: toolz in /usr/local/lib/python3.12/dist-packages (from logical-unification->pytensor<2.39,>=2.38.0->pymc>=5.20.1->nutpie[pymc]) (0.12.1)
Requirement already satisfied: multipledispatch in /usr/local/lib/python3.12/dist-packages (from logical-unification->pytensor<2.39,>=2.38.0->pymc>=5.20.1->nutpie[pymc]) (1.0.0)
Downloading arro3_core-0.8.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB)
   โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 3.0/3.0 MB 31.6 MB/s eta 0:00:00
Downloading obstore-0.9.2-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.2 MB)
   โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 4.2/4.2 MB 80.0 MB/s eta 0:00:00
Downloading zarr-3.1.5-py3-none-any.whl (284 kB)
   โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 284.1/284.1 kB 9.7 MB/s eta 0:00:00
Downloading nutpie-0.16.8-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (8.7 MB)
   โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 8.7/8.7 MB 64.5 MB/s eta 0:00:00
Downloading donfig-0.8.1.post1-py3-none-any.whl (21 kB)
Downloading numcodecs-0.16.5-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (9.2 MB)
   โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 9.2/9.2 MB 36.3 MB/s eta 0:00:00
Installing collected packages: obstore, numcodecs, donfig, arro3-core, zarr, nutpie
Successfully installed arro3-core-0.8.0 donfig-0.8.1.post1 numcodecs-0.16.5 nutpie-0.16.8 obstore-0.9.2 zarr-3.1.5

DINA model in PyMCยถ

  • 2026-03-13
  • jw

ๆœฌๆ–‡ๅ…ˆ้€€่€Œไปฅ DINA model ๅฏฆๅšใ€‚

  • DINA (Deterministic Input, Noisy AND gate). ๅ‡่จญ่ฉฒ้กŒ็›ฎ้œ€่ฆๆ‰€ๅฐๆ‡‰็š„ๆ‰€ๆœ‰ๅฑฌๆ€ง้ƒฝๆŽŒๆก (AND)๏ผŒๆ‰่ƒฝ็ญ”ๅฐใ€‚๏ผˆๅฑฌๆ€ง็š„ไน˜ๆณ•้—œไฟ‚๏ผŒไบคไบ’ๆ•ˆๆžœ้ƒจไปฝใ€‚๏ผ‰้ฉๅˆ่ช็Ÿฅๆธฌ้ฉ—ใ€‚
  • DINO (Deterministic Input, Noisy OR gate). ๅ‡่จญ่ฉฒ้กŒ็›ฎ้œ€ๆ‰€ๅฐๆ‡‰็š„ๅฑฌๆ€งๆŽŒๆกไธ€็จฎ๏ผŒๅฐฑ่ƒฝ็ญ”ๅฐ (OR)ใ€‚๏ผˆๅฑฌๆ€ง็š„ๅŠ ๆณ•้—œไฟ‚๏ผŒไธปๆ•ˆๆžœ้ƒจไปฝใ€‚๏ผ‰้ฉๅˆ้ž่ช็Ÿฅๆธฌ้ฉ—ใ€‚
  • GDINA. ็ดๅ…ฅๆ‰€ๆœ‰ๅฏ่ƒฝ็š„ไธปๆ•ˆๆžœๅ’Œไบคไบ’ๆ•ˆๆžœ็š„้ฃฝๅ’Œๆจกๅผใ€‚่ฎ“่ณ‡ๆ–™ๆฑบๅฎšๅ“ชไบ›ๆ•ˆๆžœ้‡่ฆ๏ผŒไธ้ ่จญ็ตๆง‹ใ€‚

่€ƒ้‡้ปžใ€‚ยถ

GDINA ้›–็„ถๆฏ”่ผƒๅฝˆๆ€ง๏ผŒไฝ†ๆฏๅ€‹้กŒ็›ฎ็š„ๅƒๆ•ธ้‡ไนŸ่ผƒๅคšใ€‚ๅ› ๆญค่€ƒ้‡ๅˆฐๅƒๆ•ธๆ•ธ้‡ๅ’Œๆจฃๆœฌๆ•ธ้‡็š„ๆฌŠ่กก๏ผŒ่ฆไฝฟ็”จ้ฃฝๅ’Œๆจกๅผ (GDINA, LCDM, ...) ้‚„ๆ˜ฏ็ธฎๆธ›ๆจกๅผ(DINA, DINO, ...) ้œ€่ฆ็ ”็ฉถ่€…้ธๆ“‡ใ€‚ ไปฅ GDINA ่€Œ่จ€๏ผŒๆฏๅ€‹้กŒ็›ฎ้œ€่ฆ $2^{K_j}$ ๅ€‹ๅƒๆ•ธ๏ผŒ${K_j}$ ๆ˜ฏ็œ‹ๆŸ้กŒ็›ฎ $j$ ๅฐๆ‡‰ๅนพๅ€‹ๅฑฌๆ€ง $K$. ่€Œ DINA ๆจกๅผๆฏๅ€‹้กŒ็›ฎๅฐฑๅฐๆ‡‰2ๅ€‹ๅƒๆ•ธ (slip, guess)ใ€‚

ไปฅ 20 ้กŒ๏ผŒ5 ๅ€‹ๅฑฌๆ€ง็‚บไพ‹๏ผŒ

  • DINA. $20 \times 2 = 40$ ๅ€‹ๅƒๆ•ธใ€‚
  • GDINA. ๆœ€ไฝŽๆƒ…ๆณๆฏ้กŒๅชๅฐๆ‡‰ 1 ๅ€‹ๅฑฌๆ€ง๏ผˆไฝ†้€™ๆœƒๆœ‰่ญ˜ๅˆฅๆ€งๅ•้กŒ๏ผ‰๏ผŒ$20 \times 2^1 = 40$ ๅ€‹ๅƒๆ•ธ๏ผŒๆœ€้ซ˜ๆƒ…ๆณๆฏ้กŒๅฐๆ‡‰ 5 ๅ€‹ๅฑฌๆ€ง๏ผˆ้€™ไนŸๆœ‰่ญ˜ๅˆฅๆ€งๅ•้กŒ๏ผ‰๏ผŒ$20 \times 2^5 = 640$ ๅ€‹ๅƒๆ•ธใ€‚ๆญฃๅธธๆƒ…ๆณๆ‡‰่ฉฒไป‹ๆ–ผ 40--640 ๅ€‹ๅƒๆ•ธไน‹้–“ใ€‚ไป็„ถๆ˜ฏ้žๅธธๅคง็š„ๅƒๆ•ธ้‡ใ€‚ๅœจ Q-matrix ่จญ่จˆๅพ—็•ถ็š„ๆƒ…ๆณไธ‹๏ผŒไนŸ้œ€่ฆๆœ‰่ถณๅค ็š„ๆจฃๆœฌๆ‰่ƒฝไผฐ่จˆๅพ—ๅฅฝใ€‚

ๅฆๅค–๏ผŒ GDINA ็š„ไผฐ่จˆๆ–นๆณ•ไนŸๆ›ด็‚บ่ค‡้›œใ€‚DINA ๆ˜ฏ็›ธๅฐๅฎนๆ˜“ไธ”็ฉฉๅฎš็š„้ธ้ …ไน‹ไธ€ใ€‚

DCM ่ˆ‡ mixture modelingใ€‚ยถ

่จบๆ–ทๅˆ†้กžๆจกๅผ (diagnostic classification model, DCM) (้€™้‚Šๆ‰€่ชช็š„ DINA, DINO, GDINA ้ƒฝๅฑฌๆ–ผๆญค้กž) ๅฏ่ฆ–็‚บไธ€็จฎใ€Œ้ฉ—่ญ‰ๆ€งใ€๏ผˆๆˆ–่ชช้™ๅˆถๆ€ง๏ผ‰็š„ๅˆ†้กžๆจกๅผ๏ผŒไนŸๅฐฑๆ˜ฏ่ชช่ฆๅˆ†ๅนพ้กžๆ˜ฏ็ขบๅฎšๅฅฝ็š„ใ€‚ๅณ $2^K$ ้กž๏ผŒK ๆ˜ฏๅฑฌๆ€งๆ•ธ้‡ใ€‚ไพ‹ๅฆ‚ 3 ๅ€‹ๅฑฌๆ€ง๏ผŒๅฐฑๆœ‰ $2^3=8$ ๅ€‹็ฒพ็†Ÿ็ต„ๅž‹๏ผŒๅพž {0,0,0}, {0,0,1},...,{1,1,1}, ๅชๆ˜ฏๆœ‰ไบ›็ต„ๅฏ่ƒฝๆฒ’ๆœ‰ไบบใ€‚่ฉณ่ฆ‹ๆœฌๅˆ†ๆ–‡ไปถ็š„ๆœ€ๅพŒใ€‚

็›ธ่ผƒๆ–ผใ€ŒๆŽข็ดขๆ€งใ€็š„ latent class analysis (LCA)(ๅฏ่ฆ–็‚บๅ›บๅฎšๆฝ›ๅœจ่ฎŠ็•ฐๆ•ธ็‚บ 0 ็š„ GMM), ๆˆ–ๆ˜ฏ k-means (็”จ่ท้›ขๆฑบๅฎš็š„) ๏ผŒๆˆ–ๅ…ถไป–ๅ„ๅผ mixture model๏ผˆไพ‹ๅฆ‚ Gaussian mixture model๏ผ‰๏ผŒ่ฆๅนพ้กžๅพ€ๅพ€ๆ˜ฏใ€ŒๆŽข็ดขใ€ๅ‡บไพ†็š„ใ€‚ไนŸๅฐฑๆ˜ฏ็ ”็ฉถ่€…้€้Žๆจกๅผๆฏ”่ผƒ๏ผŒๆฑบๅฎš่ฆๅˆ†ๆˆๅนพ้กžใ€‚

ๅ› ๆญค๏ผŒDCM ๆœ‰ๅˆฅๆ–ผๅ…ถไป–็š„ๅˆ†้กžๆจกๅผใ€‚ไธ€่ˆฌ็š„ๅˆ†้กžๆจกๅผๅชๆ˜ฏๅ–ฎ็ด”ๆ นๆ“š่ฎŠ้ …ๆŠŠไบบๅšๅˆ†้กž๏ผŒไฝ† DCM ๆ›ด้‡่ฆ็š„็›ฎๆจ™ๆ˜ฏๆŠŠไบบๅˆ†้€ฒใ€Œ่จญๅฎšๅฅฝ็š„็ฒพ็†Ÿ้กžๅž‹ใ€ไธญ๏ผŒ่—‰ๆญค็Ÿฅ้“ๆฏไฝๅญธ็”Ÿๅœจ้€™ไบ›ๅฑฌๆ€ง๏ผˆๆˆ–ๆŠ€่ƒฝ็ญ‰็ญ‰๏ผ‰ๆ˜ฏๅฆ็ฒพ็†Ÿ/ไธ็ฒพ็†Ÿใ€‚้”ๅˆฐ่ช็Ÿฅ่จบๆ–ท็š„ๆ•ˆๆžœใ€‚

MCMCยถ

ไนŸๅฏไปฅไฝฟ็”จ EM ไผฐ่จˆใ€‚ไฝ†้œ€่ฆๅฆๅค–ๆชข้ฉ—ใ€‚

ๅœจ่ฒๆฐ็š„ๆžถๆง‹ไธ‹ๅฏไปฅ็›ด่ฆบ็š„ๅฏซๅ‡บ็ตฑ่จˆๆจกๅผ๏ผŒไธฆ้€้Ž่‡ชๅ‹•ๆŠฝๆจฃ็š„ๆ–นๅผๅพ—ๅˆฐๅƒๆ•ธไผฐ่จˆใ€‚ๅ› ็‚บ DCM ๆถ‰ๅŠๅˆฐ้›ขๆ•ฃ็š„ๆฝ›ๅœจ่ฎŠๆ•ธ๏ผŒๅ› ๆญค้ŽๅŽป้€šๅธธ็”จ Gibbs sampler ไผฐ่จˆ ๏ผˆไพ‹ๅฆ‚ jags, ่ฆ‹ไธ‹ๆ–นๆ–‡็ป๏ผ‰

ไฝ†่€ƒ้‡ๅˆฐๅœจ python ไธŠๅฏฆๅš๏ผŒๆฏ”่ผƒๆˆ็†Ÿ็š„่ฒๆฐ็ตฑ่จˆ่ปŸ้ซ”็‚บ PyMC . ไฝ† PyMC ๆ˜ฏ็”จ HMC/NUTS ๆŠฝๆจฃๅ™จ๏ผŒ่ฉฒๆ–นๆณ•ๆฏ”่ผƒๅฟซ๏ผŒไฝ†ๅฐๆ–ผ้›ขๆ•ฃๆฝ›ๅœจ่ฎŠๆ•ธไธๅ‹ๅ–„ใ€‚ๅ› ๆญค้œ€่ฆ็‰นๆฎŠ่™•็† ๏ผˆlogsumexp๏ผ‰๏ผˆ่ฆ‹ไธ‹๏ผ‰ใ€‚

Refยถ

(Bayesian-Gibbs)

  • Zhan, P., Jiao, H., Man, K., & Wang, L. (2019). Using JAGS for Bayesian cognitive diagnosis modeling: A tutorial. Journal of Educational and Behavioral Statistics, 44(4), 473-503. LINK

(EM-DINA)

  • De La Torre, J. (2009). DINA model and parameter estimation: A didactic. Journal of educational and behavioral statistics, 34(1), 115-130. LINK

(EM-GDINA)

  • De La Torre, J. (2011). The generalized DINA model framework. Psychometrika, 76(2), 179-199. LINK
Inย [ย ]:
Copied!
"""
DINA + PyMC ่ฒๆฐไผฐ่จˆ
========================
ๆจกๅž‹๏ผšDeterministic Input, Noisy And Gate (DINA)
  - ๆฏ้กŒๅ…ฉๅ€‹ๅƒๆ•ธ๏ผšslip (s_j) ่ˆ‡ guess (g_j)
  - ็†ๆƒณๅๆ‡‰๏ผšฮท_ij = โˆ_k ฮฑ_ik^q_jk  ๏ผˆๅฟ…้ ˆๆŽŒๆกๆ‰€ๆœ‰ๅฟ…่ฆๅฑฌๆ€ง๏ผ‰
  - P(Y_ij=1 | ฮฑ_i) = (1 - s_j)^ฮท_ij * g_j^(1 - ฮท_ij)
ๆจกๆ“ฌ่ณ‡ๆ–™๏ผš500ไบบใ€20้กŒใ€3ๅฑฌๆ€ง
"""

import numpy as np
import pandas as pd
import itertools
import warnings
warnings.filterwarnings("ignore")

import pymc as pm
import pytensor.tensor as pt
from scipy.special import logsumexp
from scipy.stats import pearsonr

import nutpie
""" DINA + PyMC ่ฒๆฐไผฐ่จˆ ======================== ๆจกๅž‹๏ผšDeterministic Input, Noisy And Gate (DINA) - ๆฏ้กŒๅ…ฉๅ€‹ๅƒๆ•ธ๏ผšslip (s_j) ่ˆ‡ guess (g_j) - ็†ๆƒณๅๆ‡‰๏ผšฮท_ij = โˆ_k ฮฑ_ik^q_jk ๏ผˆๅฟ…้ ˆๆŽŒๆกๆ‰€ๆœ‰ๅฟ…่ฆๅฑฌๆ€ง๏ผ‰ - P(Y_ij=1 | ฮฑ_i) = (1 - s_j)^ฮท_ij * g_j^(1 - ฮท_ij) ๆจกๆ“ฌ่ณ‡ๆ–™๏ผš500ไบบใ€20้กŒใ€3ๅฑฌๆ€ง """ import numpy as np import pandas as pd import itertools import warnings warnings.filterwarnings("ignore") import pymc as pm import pytensor.tensor as pt from scipy.special import logsumexp from scipy.stats import pearsonr import nutpie
Inย [ย ]:
Copied!
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
# PART 1๏ผšๆจกๆ“ฌ่ณ‡ๆ–™
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•

np.random.seed(1234)

N, J, K = 1500, 20, 3
all_alpha = np.array(list(itertools.product([0, 1], repeat=K)))  # (8, 3)
L = len(all_alpha)  # 2^K = 8 ็จฎๅฑฌๆ€ง็ต„ๅˆ

# Q ็Ÿฉ้™ฃ๏ผšๆŒ‡ๅฎšๆฏ้กŒๆธฌ้‡ๅ“ชไบ›ๅฑฌๆ€ง
Q = np.array([
    [1, 0, 0], [1, 0, 0], [1, 0, 0], [1, 0, 0],   # ้กŒ 1-4๏ผšๅƒ…ๆธฌๅฑฌๆ€ง1
    [0, 1, 0], [0, 1, 0], [0, 1, 0], [0, 1, 0],   # ้กŒ 5-8๏ผšๅƒ…ๆธฌๅฑฌๆ€ง2
    [0, 0, 1], [0, 0, 1], [0, 0, 1], [0, 0, 1],   # ้กŒ 9-12๏ผšๅƒ…ๆธฌๅฑฌๆ€ง3
    [1, 1, 0], [1, 1, 0], [1, 1, 0],               # ้กŒ 13-15๏ผšๆธฌๅฑฌๆ€ง1+2
    [1, 0, 1], [1, 0, 1], [1, 0, 1],               # ้กŒ 16-18๏ผšๆธฌๅฑฌๆ€ง1+3
    [1, 1, 1], [1, 1, 1],                           # ้กŒ 19-20๏ผšๆธฌๅ…จ้ƒจๅฑฌๆ€ง
])

# ๅฑฌๆ€ง็ต„ๅˆ็š„ๅ…ˆ้ฉ—ๆฏ”ไพ‹
pi_true = np.array([0.05, 0.08, 0.08, 0.12, 0.10, 0.12, 0.15, 0.30])

# DINA ็œŸๅฏฆๅƒๆ•ธ๏ผšslip ่ˆ‡ guess
# slip๏ผšๆŽŒๆกๆ‰€ๆœ‰ๅฑฌๆ€งไฝ†็ญ”้Œฏ็š„ๆฆ‚็އ๏ผ›guess๏ผšๆœชๆŽŒๆกๅฑฌๆ€งไฝ†็ญ”ๅฐ็š„ๆฆ‚็އ
slip_true  = np.array([0.10, 0.15, 0.10, 0.15,   # ้กŒ 1-4
                        0.10, 0.15, 0.10, 0.15,   # ้กŒ 5-8
                        0.10, 0.15, 0.10, 0.15,   # ้กŒ 9-12
                        0.10, 0.15, 0.10,           # ้กŒ 13-15
                        0.10, 0.15, 0.10,           # ้กŒ 16-18
                        0.10, 0.15])               # ้กŒ 19-20

guess_true = np.array([0.15, 0.20, 0.10, 0.25,   # ้กŒ 1-4
                        0.15, 0.20, 0.10, 0.25,   # ้กŒ 5-8
                        0.15, 0.20, 0.10, 0.25,   # ้กŒ 9-12
                        0.10, 0.15, 0.10,           # ้กŒ 13-15
                        0.10, 0.15, 0.10,           # ้กŒ 16-18
                        0.10, 0.15])               # ้กŒ 19-20

# ่จˆ็ฎ—ๅ„ๅฑฌๆ€ง็ต„ๅˆไธ‹๏ผŒๆฏ้กŒ็š„็†ๆƒณๅๆ‡‰ ฮท_lj = โˆ_k ฮฑ_lk^q_jk
# eta_true[l, j] = 1 iff ๅฑฌๆ€ง็ต„ๅˆ l ๆŽŒๆก้กŒ j ็š„ๆ‰€ๆœ‰ๅฟ…่ฆๅฑฌๆ€ง
eta_true = np.array([
    [int(np.all(all_alpha[l] >= Q[j])) for j in range(J)]
    for l in range(L)
])  # (L, J)

# DINA ๆˆๅŠŸๆฆ‚็އ๏ผšP(Y=1|ฮฑ_l) = (1-s_j)^ฮท_lj * g_j^(1-ฮท_lj)
P_true = np.zeros((J, L))
for j in range(J):
    for l in range(L):
        if eta_true[l, j] == 1:
            P_true[j, l] = 1 - slip_true[j]    # ๆŽŒๆก โ†’ ๆˆๅŠŸๆฆ‚็އ้ซ˜
        else:
            P_true[j, l] = guess_true[j]        # ๆœชๆŽŒๆก โ†’ ็Œœๅฐๆฆ‚็އไฝŽ

# ๆจกๆ“ฌๅ—่ฉฆ่€…ๅฑฌๆ€ง็ต„ๅˆ
alpha_idx  = np.random.choice(L, size=N, p=pi_true)
Alpha_true = all_alpha[alpha_idx]

# ๆจกๆ“ฌไฝœ็ญ”่ณ‡ๆ–™
Y = np.array([
    [np.random.binomial(1, P_true[j, alpha_idx[i]]) for j in range(J)]
    for i in range(N)
])

print(f"ๆจกๆ“ฌ่ณ‡ๆ–™ๅฎŒๆˆ๏ผšY={Y.shape}, Q={Q.shape}")
print(f"slip ็ฏ„ๅœ๏ผš{slip_true.min():.2f} ~ {slip_true.max():.2f}")
print(f"guess ็ฏ„ๅœ๏ผš{guess_true.min():.2f} ~ {guess_true.max():.2f}")
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• # PART 1๏ผšๆจกๆ“ฌ่ณ‡ๆ–™ # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• np.random.seed(1234) N, J, K = 1500, 20, 3 all_alpha = np.array(list(itertools.product([0, 1], repeat=K))) # (8, 3) L = len(all_alpha) # 2^K = 8 ็จฎๅฑฌๆ€ง็ต„ๅˆ # Q ็Ÿฉ้™ฃ๏ผšๆŒ‡ๅฎšๆฏ้กŒๆธฌ้‡ๅ“ชไบ›ๅฑฌๆ€ง Q = np.array([ [1, 0, 0], [1, 0, 0], [1, 0, 0], [1, 0, 0], # ้กŒ 1-4๏ผšๅƒ…ๆธฌๅฑฌๆ€ง1 [0, 1, 0], [0, 1, 0], [0, 1, 0], [0, 1, 0], # ้กŒ 5-8๏ผšๅƒ…ๆธฌๅฑฌๆ€ง2 [0, 0, 1], [0, 0, 1], [0, 0, 1], [0, 0, 1], # ้กŒ 9-12๏ผšๅƒ…ๆธฌๅฑฌๆ€ง3 [1, 1, 0], [1, 1, 0], [1, 1, 0], # ้กŒ 13-15๏ผšๆธฌๅฑฌๆ€ง1+2 [1, 0, 1], [1, 0, 1], [1, 0, 1], # ้กŒ 16-18๏ผšๆธฌๅฑฌๆ€ง1+3 [1, 1, 1], [1, 1, 1], # ้กŒ 19-20๏ผšๆธฌๅ…จ้ƒจๅฑฌๆ€ง ]) # ๅฑฌๆ€ง็ต„ๅˆ็š„ๅ…ˆ้ฉ—ๆฏ”ไพ‹ pi_true = np.array([0.05, 0.08, 0.08, 0.12, 0.10, 0.12, 0.15, 0.30]) # DINA ็œŸๅฏฆๅƒๆ•ธ๏ผšslip ่ˆ‡ guess # slip๏ผšๆŽŒๆกๆ‰€ๆœ‰ๅฑฌๆ€งไฝ†็ญ”้Œฏ็š„ๆฆ‚็އ๏ผ›guess๏ผšๆœชๆŽŒๆกๅฑฌๆ€งไฝ†็ญ”ๅฐ็š„ๆฆ‚็އ slip_true = np.array([0.10, 0.15, 0.10, 0.15, # ้กŒ 1-4 0.10, 0.15, 0.10, 0.15, # ้กŒ 5-8 0.10, 0.15, 0.10, 0.15, # ้กŒ 9-12 0.10, 0.15, 0.10, # ้กŒ 13-15 0.10, 0.15, 0.10, # ้กŒ 16-18 0.10, 0.15]) # ้กŒ 19-20 guess_true = np.array([0.15, 0.20, 0.10, 0.25, # ้กŒ 1-4 0.15, 0.20, 0.10, 0.25, # ้กŒ 5-8 0.15, 0.20, 0.10, 0.25, # ้กŒ 9-12 0.10, 0.15, 0.10, # ้กŒ 13-15 0.10, 0.15, 0.10, # ้กŒ 16-18 0.10, 0.15]) # ้กŒ 19-20 # ่จˆ็ฎ—ๅ„ๅฑฌๆ€ง็ต„ๅˆไธ‹๏ผŒๆฏ้กŒ็š„็†ๆƒณๅๆ‡‰ ฮท_lj = โˆ_k ฮฑ_lk^q_jk # eta_true[l, j] = 1 iff ๅฑฌๆ€ง็ต„ๅˆ l ๆŽŒๆก้กŒ j ็š„ๆ‰€ๆœ‰ๅฟ…่ฆๅฑฌๆ€ง eta_true = np.array([ [int(np.all(all_alpha[l] >= Q[j])) for j in range(J)] for l in range(L) ]) # (L, J) # DINA ๆˆๅŠŸๆฆ‚็އ๏ผšP(Y=1|ฮฑ_l) = (1-s_j)^ฮท_lj * g_j^(1-ฮท_lj) P_true = np.zeros((J, L)) for j in range(J): for l in range(L): if eta_true[l, j] == 1: P_true[j, l] = 1 - slip_true[j] # ๆŽŒๆก โ†’ ๆˆๅŠŸๆฆ‚็އ้ซ˜ else: P_true[j, l] = guess_true[j] # ๆœชๆŽŒๆก โ†’ ็Œœๅฐๆฆ‚็އไฝŽ # ๆจกๆ“ฌๅ—่ฉฆ่€…ๅฑฌๆ€ง็ต„ๅˆ alpha_idx = np.random.choice(L, size=N, p=pi_true) Alpha_true = all_alpha[alpha_idx] # ๆจกๆ“ฌไฝœ็ญ”่ณ‡ๆ–™ Y = np.array([ [np.random.binomial(1, P_true[j, alpha_idx[i]]) for j in range(J)] for i in range(N) ]) print(f"ๆจกๆ“ฌ่ณ‡ๆ–™ๅฎŒๆˆ๏ผšY={Y.shape}, Q={Q.shape}") print(f"slip ็ฏ„ๅœ๏ผš{slip_true.min():.2f} ~ {slip_true.max():.2f}") print(f"guess ็ฏ„ๅœ๏ผš{guess_true.min():.2f} ~ {guess_true.max():.2f}")
ๆจกๆ“ฌ่ณ‡ๆ–™ๅฎŒๆˆ๏ผšY=(1500, 20), Q=(20, 3)
slip ็ฏ„ๅœ๏ผš0.10 ~ 0.15
guess ็ฏ„ๅœ๏ผš0.10 ~ 0.25
Inย [ย ]:
Copied!
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
# PART 2๏ผš้ ่จˆ็ฎ— ฮท ็Ÿฉ้™ฃ๏ผˆPyMC ๅค–้ƒจ๏ผ‰
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
# ฮท_lj = 1 iff ๅฑฌๆ€ง็ต„ๅˆ l ๆŽŒๆกไบ†้กŒ j ็š„ๆ‰€ๆœ‰ๅฟ…่ฆๅฑฌๆ€ง
# ๆญค็‚บๅ›บๅฎš็š„็ตๆง‹็Ÿฉ้™ฃ๏ผŒ็”ฑ Q ็Ÿฉ้™ฃๆฑบๅฎš๏ผŒไธ้œ€่ฆไผฐ่จˆ

# eta[l, j]: ๅฑฌๆ€ง็ต„ๅˆ l ๅฐ้กŒ j ็š„็†ๆƒณๅๆ‡‰
eta = np.array([
    [int(np.all(all_alpha[l] >= Q[j])) for j in range(J)]
    for l in range(L)
], dtype=np.float32)  # (L, J)

print("ฮท ็Ÿฉ้™ฃ๏ผˆๅ‰ 4 ้กŒ ร— ๅ…จ้ƒจ 8 ๅ€‹ๅฑฌๆ€ง็ต„ๅˆ๏ผ‰๏ผš")
df_eta = pd.DataFrame(eta[:, :4],
                       index=["".join(map(str, a)) for a in all_alpha],
                       columns=[f"Q{j+1}" for j in range(4)])
print(df_eta.to_string())
print("\n๏ผˆฮท=1 ่กจ็คบ่ฉฒๅฑฌๆ€ง็ต„ๅˆๆŽŒๆกไบ†้กŒ็›ฎๆ‰€ๆœ‰ๅฟ…่ฆๅฑฌๆ€ง๏ผ‰")
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• # PART 2๏ผš้ ่จˆ็ฎ— ฮท ็Ÿฉ้™ฃ๏ผˆPyMC ๅค–้ƒจ๏ผ‰ # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• # ฮท_lj = 1 iff ๅฑฌๆ€ง็ต„ๅˆ l ๆŽŒๆกไบ†้กŒ j ็š„ๆ‰€ๆœ‰ๅฟ…่ฆๅฑฌๆ€ง # ๆญค็‚บๅ›บๅฎš็š„็ตๆง‹็Ÿฉ้™ฃ๏ผŒ็”ฑ Q ็Ÿฉ้™ฃๆฑบๅฎš๏ผŒไธ้œ€่ฆไผฐ่จˆ # eta[l, j]: ๅฑฌๆ€ง็ต„ๅˆ l ๅฐ้กŒ j ็š„็†ๆƒณๅๆ‡‰ eta = np.array([ [int(np.all(all_alpha[l] >= Q[j])) for j in range(J)] for l in range(L) ], dtype=np.float32) # (L, J) print("ฮท ็Ÿฉ้™ฃ๏ผˆๅ‰ 4 ้กŒ ร— ๅ…จ้ƒจ 8 ๅ€‹ๅฑฌๆ€ง็ต„ๅˆ๏ผ‰๏ผš") df_eta = pd.DataFrame(eta[:, :4], index=["".join(map(str, a)) for a in all_alpha], columns=[f"Q{j+1}" for j in range(4)]) print(df_eta.to_string()) print("\n๏ผˆฮท=1 ่กจ็คบ่ฉฒๅฑฌๆ€ง็ต„ๅˆๆŽŒๆกไบ†้กŒ็›ฎๆ‰€ๆœ‰ๅฟ…่ฆๅฑฌๆ€ง๏ผ‰")
ฮท ็Ÿฉ้™ฃ๏ผˆๅ‰ 4 ้กŒ ร— ๅ…จ้ƒจ 8 ๅ€‹ๅฑฌๆ€ง็ต„ๅˆ๏ผ‰๏ผš
      Q1   Q2   Q3   Q4
000  0.0  0.0  0.0  0.0
001  0.0  0.0  0.0  0.0
010  0.0  0.0  0.0  0.0
011  0.0  0.0  0.0  0.0
100  1.0  1.0  1.0  1.0
101  1.0  1.0  1.0  1.0
110  1.0  1.0  1.0  1.0
111  1.0  1.0  1.0  1.0

๏ผˆฮท=1 ่กจ็คบ่ฉฒๅฑฌๆ€ง็ต„ๅˆๆŽŒๆกไบ†้กŒ็›ฎๆ‰€ๆœ‰ๅฟ…่ฆๅฑฌๆ€ง๏ผ‰

้€™้‚Š็”จ VI ไผฐ่จˆๆฏ”่ผƒๅฟซ๏ผŒไฝ†ๆ˜ฏๅƒๆ•ธๅ›žๅพฉๆ€งๆฏ”่ผƒไธๅฅฝใ€‚ ็”จ MCMC ไผฐ่จˆๆฏ”่ผƒๆ…ข๏ผˆๅœจ colab ไธŠ็ด„้ ˆ่ฆ 1-3 ๅˆ†้˜๏ผŒๅˆฐๆœฌๅœฐไธŠๅฏ่ƒฝๅฏๅŠ ้€Ÿ๏ผ‰๏ผŒไฝ†ๅƒๆ•ธๅ›žๅพฉๆ€ง่ผƒๅฅฝใ€‚

Inย [ย ]:
Copied!
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
# PART 3๏ผšPyMC DINA๏ผˆMCMC ไผฐ่จˆ๏ผ‰
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
#
# DINA ๆจกๅž‹๏ผš
#   P(Y_ij=1 | ฮฑ_i) = (1 - s_j)^ฮท_ij * g_j^(1 - ฮท_ij)
#
# ๅ…ˆ้ฉ—่จญๅฎš๏ผˆๅƒ่€ƒ่ซ–ๆ–‡ de la Torre 2009; Junker & Sijtsma 2001๏ผ‰๏ผš
#   slip_j ~ Beta(1, 4)   โ†’ ๅๅ‘ๅฐๅ€ผ๏ผˆslip ้€šๅธธ่ผƒๅฐ๏ผ‰
#   guess_j ~ Beta(1, 4)  โ†’ ๅๅ‘ๅฐๅ€ผ๏ผˆguess ้€šๅธธ่ผƒๅฐ๏ผ‰
#   ฯ€ ~ Dirichlet(1, ..., 1)  โ†’ ๅ‡ๅ‹ปๅ…ˆ้ฉ—
#
# ๅฏ่ญ˜ๅˆฅๆ€ง็ด„ๆŸ๏ผšs_j + g_j < 1  ๏ผˆ้€้Žๅ…ˆ้ฉ—ๅฝข็‹€้šฑๅผๆŽงๅˆถ๏ผ‰

eta_pt = pt.constant(eta)  # (L, J)
Y_pt   = pt.constant(Y.astype(np.float32))  # (N, J)

with pm.Model() as dina_model:

    # โ”€โ”€ ๅ…ˆ้ฉ— โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
    # ๅฑฌๆ€ง็ต„ๅˆ็š„ๆททๅˆๆฏ”ไพ‹
    pi = pm.Dirichlet("pi", a=np.ones(L))  # (L,)

    # ๆฏ้กŒ็š„ slip ่ˆ‡ guess ๅƒๆ•ธ
    # Beta(1,4) ๅ…ˆ้ฉ—่ฎ“ๅƒๆ•ธๅๅ‘ 0๏ผŒ็ฌฆๅˆ DINA ็š„่ญ˜ๅˆฅๆขไปถ
    slip  = pm.Beta("slip",  alpha=1, beta=4, shape=(J,))  # (J,)
    guess = pm.Beta("guess", alpha=1, beta=4, shape=(J,))  # (J,)

    # โ”€โ”€ DINA ๆˆๅŠŸๆฆ‚็އ็Ÿฉ้™ฃ P_mat (J, L) โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
    # P_mat[j, l] = (1 - slip[j])^ฮท[l,j] * guess[j]^(1 - ฮท[l,j])
    # ๅˆฉ็”จ log-sum trick ่จˆ็ฎ—
    #   log P = ฮท[l,j] * log(1-slip[j]) + (1-ฮท[l,j]) * log(guess[j])

    log_1_minus_slip = pt.log(1 - slip + 1e-8)  # (J,)
    log_guess        = pt.log(guess + 1e-8)      # (J,)

    # eta_pt: (L, J) โ†’ ่ฝ‰็ฝฎๆˆ (J, L) ไปฅไพฟๅปฃๆ’ญ
    eta_T = eta_pt.T  # (J, L)

    # log P_mat[j, l] = ฮท[l,j]*log(1-s_j) + (1-ฮท[l,j])*log(g_j)
    log_P_mat = (
        eta_T * log_1_minus_slip[:, None] +
        (1 - eta_T) * log_guess[:, None]
    )  # (J, L)

    P_mat = pt.exp(log_P_mat)  # (J, L)

    # โ”€โ”€ Mixture likelihood๏ผˆๅฐ L ๅ€‹ๅฑฌๆ€ง็ต„ๅˆๅš marginalization๏ผ‰โ”€
    # log p(Y_i | ฯ€, s, g) = log ฮฃ_l ฯ€_l * โˆ_j P_mat[j,l]^Y_ij * (1-P_mat[j,l])^(1-Y_ij)

    log_likes = pt.stack([
        pt.log(pi[l] + 1e-8) +
        pt.dot(Y_pt,       pt.log(P_mat[:, l] + 1e-8)) +
        pt.dot(1 - Y_pt,   pt.log(1 - P_mat[:, l] + 1e-8))
        for l in range(L)
    ], axis=1)  # (N, L)

    pm.Potential("obs", pt.logsumexp(log_likes, axis=1).sum())

    # โ”€โ”€ VI ไผฐ่จˆ๏ผˆADVI๏ผ‰โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
    """
    approx = pm.fit(20000,
                    #method="advi",
                    method="fullrank_advi",
                    callbacks=[pm.callbacks.CheckParametersConvergence(tolerance=1e-4)])

    trace = approx.sample(2000)
    """
    #trace = pm.sample(cores=2, target_accept=0.85, init="advi+adapt_diag")


    # -- nutpie

    compiled = nutpie.compile_pymc_model(dina_model)
    trace = nutpie.sample(compiled, chains=2, cores=2, target_accept=0.85)
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• # PART 3๏ผšPyMC DINA๏ผˆMCMC ไผฐ่จˆ๏ผ‰ # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• # # DINA ๆจกๅž‹๏ผš # P(Y_ij=1 | ฮฑ_i) = (1 - s_j)^ฮท_ij * g_j^(1 - ฮท_ij) # # ๅ…ˆ้ฉ—่จญๅฎš๏ผˆๅƒ่€ƒ่ซ–ๆ–‡ de la Torre 2009; Junker & Sijtsma 2001๏ผ‰๏ผš # slip_j ~ Beta(1, 4) โ†’ ๅๅ‘ๅฐๅ€ผ๏ผˆslip ้€šๅธธ่ผƒๅฐ๏ผ‰ # guess_j ~ Beta(1, 4) โ†’ ๅๅ‘ๅฐๅ€ผ๏ผˆguess ้€šๅธธ่ผƒๅฐ๏ผ‰ # ฯ€ ~ Dirichlet(1, ..., 1) โ†’ ๅ‡ๅ‹ปๅ…ˆ้ฉ— # # ๅฏ่ญ˜ๅˆฅๆ€ง็ด„ๆŸ๏ผšs_j + g_j < 1 ๏ผˆ้€้Žๅ…ˆ้ฉ—ๅฝข็‹€้šฑๅผๆŽงๅˆถ๏ผ‰ eta_pt = pt.constant(eta) # (L, J) Y_pt = pt.constant(Y.astype(np.float32)) # (N, J) with pm.Model() as dina_model: # โ”€โ”€ ๅ…ˆ้ฉ— โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ # ๅฑฌๆ€ง็ต„ๅˆ็š„ๆททๅˆๆฏ”ไพ‹ pi = pm.Dirichlet("pi", a=np.ones(L)) # (L,) # ๆฏ้กŒ็š„ slip ่ˆ‡ guess ๅƒๆ•ธ # Beta(1,4) ๅ…ˆ้ฉ—่ฎ“ๅƒๆ•ธๅๅ‘ 0๏ผŒ็ฌฆๅˆ DINA ็š„่ญ˜ๅˆฅๆขไปถ slip = pm.Beta("slip", alpha=1, beta=4, shape=(J,)) # (J,) guess = pm.Beta("guess", alpha=1, beta=4, shape=(J,)) # (J,) # โ”€โ”€ DINA ๆˆๅŠŸๆฆ‚็އ็Ÿฉ้™ฃ P_mat (J, L) โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ # P_mat[j, l] = (1 - slip[j])^ฮท[l,j] * guess[j]^(1 - ฮท[l,j]) # ๅˆฉ็”จ log-sum trick ่จˆ็ฎ— # log P = ฮท[l,j] * log(1-slip[j]) + (1-ฮท[l,j]) * log(guess[j]) log_1_minus_slip = pt.log(1 - slip + 1e-8) # (J,) log_guess = pt.log(guess + 1e-8) # (J,) # eta_pt: (L, J) โ†’ ่ฝ‰็ฝฎๆˆ (J, L) ไปฅไพฟๅปฃๆ’ญ eta_T = eta_pt.T # (J, L) # log P_mat[j, l] = ฮท[l,j]*log(1-s_j) + (1-ฮท[l,j])*log(g_j) log_P_mat = ( eta_T * log_1_minus_slip[:, None] + (1 - eta_T) * log_guess[:, None] ) # (J, L) P_mat = pt.exp(log_P_mat) # (J, L) # โ”€โ”€ Mixture likelihood๏ผˆๅฐ L ๅ€‹ๅฑฌๆ€ง็ต„ๅˆๅš marginalization๏ผ‰โ”€ # log p(Y_i | ฯ€, s, g) = log ฮฃ_l ฯ€_l * โˆ_j P_mat[j,l]^Y_ij * (1-P_mat[j,l])^(1-Y_ij) log_likes = pt.stack([ pt.log(pi[l] + 1e-8) + pt.dot(Y_pt, pt.log(P_mat[:, l] + 1e-8)) + pt.dot(1 - Y_pt, pt.log(1 - P_mat[:, l] + 1e-8)) for l in range(L) ], axis=1) # (N, L) pm.Potential("obs", pt.logsumexp(log_likes, axis=1).sum()) # โ”€โ”€ VI ไผฐ่จˆ๏ผˆADVI๏ผ‰โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ """ approx = pm.fit(20000, #method="advi", method="fullrank_advi", callbacks=[pm.callbacks.CheckParametersConvergence(tolerance=1e-4)]) trace = approx.sample(2000) """ #trace = pm.sample(cores=2, target_accept=0.85, init="advi+adapt_diag") # -- nutpie compiled = nutpie.compile_pymc_model(dina_model) trace = nutpie.sample(compiled, chains=2, cores=2, target_accept=0.85)

Sampler Progress

Total Chains: 2

Active Chains: 0

Finished Chains: 2

Sampling for a minute

Estimated Time to Completion: now

Progress Draws Divergences Step Size Gradients/Draw
1400 0 0.44 7
1400 0 0.35 15
Inย [ย ]:
Copied!
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
# PART 4๏ผšๅŸบๆœฌๅƒๆ•ธๅ›žๆ”ถ็ตๆžœ
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•

# ๆๅ–ๅพŒ้ฉ—ๅ‡ๅ€ผ
slip_est  = trace.posterior["slip"].values.reshape(-1, J).mean(axis=0)   # (J,)
guess_est = trace.posterior["guess"].values.reshape(-1, J).mean(axis=0)  # (J,)
pi_est    = trace.posterior["pi"].values.reshape(-1, L).mean(axis=0)     # (L,)

# Pearson r
r_slip,  _ = pearsonr(slip_true,  slip_est)
r_guess, _ = pearsonr(guess_true, guess_est)
r_pi,    _ = pearsonr(pi_true,    pi_est)

print("ๅƒๆ•ธๅ›žๆ”ถ๏ผˆPearson r๏ผ‰๏ผš")
print(f"  slip   r = {r_slip:.4f}")
print(f"  guess  r = {r_guess:.4f}")
print(f"  ฯ€      r = {r_pi:.4f}")
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• # PART 4๏ผšๅŸบๆœฌๅƒๆ•ธๅ›žๆ”ถ็ตๆžœ # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• # ๆๅ–ๅพŒ้ฉ—ๅ‡ๅ€ผ slip_est = trace.posterior["slip"].values.reshape(-1, J).mean(axis=0) # (J,) guess_est = trace.posterior["guess"].values.reshape(-1, J).mean(axis=0) # (J,) pi_est = trace.posterior["pi"].values.reshape(-1, L).mean(axis=0) # (L,) # Pearson r r_slip, _ = pearsonr(slip_true, slip_est) r_guess, _ = pearsonr(guess_true, guess_est) r_pi, _ = pearsonr(pi_true, pi_est) print("ๅƒๆ•ธๅ›žๆ”ถ๏ผˆPearson r๏ผ‰๏ผš") print(f" slip r = {r_slip:.4f}") print(f" guess r = {r_guess:.4f}") print(f" ฯ€ r = {r_pi:.4f}")
ๅƒๆ•ธๅ›žๆ”ถ๏ผˆPearson r๏ผ‰๏ผš
  slip   r = 0.8583
  guess  r = 0.9777
  ฯ€      r = 0.9914
Inย [ย ]:
Copied!
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
# PART 5๏ผšๅ„ๅƒๆ•ธไผฐ่จˆ่กจ
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•

profile_labels = ["".join(map(str, a)) for a in all_alpha]

# โ”€โ”€ 5a. slip ่ˆ‡ guess๏ผš็œŸๅฏฆ vs ไผฐ่จˆ โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
slip_sd  = trace.posterior["slip"].values.reshape(-1, J).std(axis=0)
guess_sd = trace.posterior["guess"].values.reshape(-1, J).std(axis=0)

df_sg = pd.DataFrame({
    "้กŒ็›ฎ":       [f"Q{j+1}" for j in range(J)],
    "slip ็œŸๅฏฆ":  slip_true.round(3),
    "slip ไผฐ่จˆ":  slip_est.round(3),
    "slip SD":    slip_sd.round(3),
    "slip ๅๅทฎ":  (slip_est - slip_true).round(3),
    "guess ็œŸๅฏฆ": guess_true.round(3),
    "guess ไผฐ่จˆ": guess_est.round(3),
    "guess SD":   guess_sd.round(3),
    "guess ๅๅทฎ": (guess_est - guess_true).round(3),
})
print("\n" + "="*70)
print("ใ€DINA ๅƒๆ•ธ๏ผšslip & guess โ€” ็œŸๅฏฆ vs ไผฐ่จˆใ€‘")
print("="*70)
print(df_sg.to_string(index=False))

# ๅนณๅ‡็ต•ๅฐๅๅทฎ
mab_slip  = np.abs(slip_est  - slip_true).mean()
mab_guess = np.abs(guess_est - guess_true).mean()
print(f"\n  slip  MAB = {mab_slip:.4f}")
print(f"  guess MAB = {mab_guess:.4f}")
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• # PART 5๏ผšๅ„ๅƒๆ•ธไผฐ่จˆ่กจ # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• profile_labels = ["".join(map(str, a)) for a in all_alpha] # โ”€โ”€ 5a. slip ่ˆ‡ guess๏ผš็œŸๅฏฆ vs ไผฐ่จˆ โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ slip_sd = trace.posterior["slip"].values.reshape(-1, J).std(axis=0) guess_sd = trace.posterior["guess"].values.reshape(-1, J).std(axis=0) df_sg = pd.DataFrame({ "้กŒ็›ฎ": [f"Q{j+1}" for j in range(J)], "slip ็œŸๅฏฆ": slip_true.round(3), "slip ไผฐ่จˆ": slip_est.round(3), "slip SD": slip_sd.round(3), "slip ๅๅทฎ": (slip_est - slip_true).round(3), "guess ็œŸๅฏฆ": guess_true.round(3), "guess ไผฐ่จˆ": guess_est.round(3), "guess SD": guess_sd.round(3), "guess ๅๅทฎ": (guess_est - guess_true).round(3), }) print("\n" + "="*70) print("ใ€DINA ๅƒๆ•ธ๏ผšslip & guess โ€” ็œŸๅฏฆ vs ไผฐ่จˆใ€‘") print("="*70) print(df_sg.to_string(index=False)) # ๅนณๅ‡็ต•ๅฐๅๅทฎ mab_slip = np.abs(slip_est - slip_true).mean() mab_guess = np.abs(guess_est - guess_true).mean() print(f"\n slip MAB = {mab_slip:.4f}") print(f" guess MAB = {mab_guess:.4f}")
======================================================================
ใ€DINA ๅƒๆ•ธ๏ผšslip & guess โ€” ็œŸๅฏฆ vs ไผฐ่จˆใ€‘
======================================================================
 ้กŒ็›ฎ  slip ็œŸๅฏฆ  slip ไผฐ่จˆ  slip SD  slip ๅๅทฎ  guess ็œŸๅฏฆ  guess ไผฐ่จˆ  guess SD  guess ๅๅทฎ
 Q1     0.10    0.110    0.010    0.010      0.15     0.167     0.017     0.017
 Q2     0.15    0.170    0.012    0.020      0.20     0.205     0.019     0.005
 Q3     0.10    0.112    0.010    0.012      0.10     0.128     0.015     0.028
 Q4     0.15    0.147    0.011   -0.003      0.25     0.254     0.020     0.004
 Q5     0.10    0.100    0.010   -0.000      0.15     0.165     0.018     0.015
 Q6     0.15    0.152    0.012    0.002      0.20     0.192     0.018    -0.008
 Q7     0.10    0.101    0.010    0.001      0.10     0.120     0.016     0.020
 Q8     0.15    0.144    0.012   -0.006      0.25     0.248     0.020    -0.002
 Q9     0.10    0.105    0.011    0.005      0.15     0.147     0.016    -0.003
Q10     0.15    0.140    0.012   -0.010      0.20     0.199     0.017    -0.001
Q11     0.10    0.088    0.010   -0.012      0.10     0.105     0.013     0.005
Q12     0.15    0.141    0.011   -0.009      0.25     0.258     0.019     0.008
Q13     0.10    0.105    0.012    0.005      0.10     0.100     0.011     0.000
Q14     0.15    0.127    0.013   -0.023      0.15     0.135     0.012    -0.015
Q15     0.10    0.091    0.011   -0.009      0.10     0.112     0.012     0.012
Q16     0.10    0.118    0.013    0.018      0.10     0.087     0.010    -0.013
Q17     0.15    0.141    0.014   -0.009      0.15     0.151     0.012     0.001
Q18     0.10    0.080    0.010   -0.020      0.10     0.100     0.010     0.000
Q19     0.10    0.131    0.015    0.031      0.10     0.087     0.009    -0.013
Q20     0.15    0.166    0.017    0.016      0.15     0.162     0.011     0.012

  slip  MAB = 0.0109
  guess MAB = 0.0092
Inย [ย ]:
Copied!
# โ”€โ”€ 5b. ็”ฑ slip/guess ้‡ๅปบ P ็Ÿฉ้™ฃ โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
# P_est[j, l] = (1 - slip_est[j])^ฮท[l,j] * guess_est[j]^(1 - ฮท[l,j])

P_est = np.zeros((J, L))
for j in range(J):
    for l in range(L):
        if eta[l, j] == 1:
            P_est[j, l] = 1 - slip_est[j]
        else:
            P_est[j, l] = guess_est[j]

df_P_true = pd.DataFrame(P_true,
                          index=[f"Q{j+1}" for j in range(J)],
                          columns=profile_labels)
df_P_est  = pd.DataFrame(P_est,
                          index=[f"Q{j+1}" for j in range(J)],
                          columns=profile_labels)

print("\n" + "="*60)
print("ใ€ๅ„้กŒๆˆๅŠŸๆฆ‚็އ P โ€” ็œŸๅฏฆๅ€ผใ€‘")
print("  ๏ผˆๆฏ้กŒๅชๆœ‰ 2 ็จฎไธๅŒๅ€ผ๏ผšguess vs 1-slip๏ผ‰")
print("="*60)
print(df_P_true.round(3).to_string())

print("\n" + "="*60)
print("ใ€ๅ„้กŒๆˆๅŠŸๆฆ‚็އ P โ€” PyMC DINA ไผฐ่จˆๅ€ผ๏ผˆๅพŒ้ฉ—ๅ‡ๅ€ผ๏ผ‰ใ€‘")
print("="*60)
print(df_P_est.round(3).to_string())

df_P_diff = df_P_est - df_P_true
mab = df_P_diff.abs().values.mean()
rmse = np.sqrt((df_P_diff.values**2).mean())
print(f"\n  P ็Ÿฉ้™ฃ MAB  = {mab:.4f}")
print(f"  P ็Ÿฉ้™ฃ RMSE = {rmse:.4f}")
# โ”€โ”€ 5b. ็”ฑ slip/guess ้‡ๅปบ P ็Ÿฉ้™ฃ โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ # P_est[j, l] = (1 - slip_est[j])^ฮท[l,j] * guess_est[j]^(1 - ฮท[l,j]) P_est = np.zeros((J, L)) for j in range(J): for l in range(L): if eta[l, j] == 1: P_est[j, l] = 1 - slip_est[j] else: P_est[j, l] = guess_est[j] df_P_true = pd.DataFrame(P_true, index=[f"Q{j+1}" for j in range(J)], columns=profile_labels) df_P_est = pd.DataFrame(P_est, index=[f"Q{j+1}" for j in range(J)], columns=profile_labels) print("\n" + "="*60) print("ใ€ๅ„้กŒๆˆๅŠŸๆฆ‚็އ P โ€” ็œŸๅฏฆๅ€ผใ€‘") print(" ๏ผˆๆฏ้กŒๅชๆœ‰ 2 ็จฎไธๅŒๅ€ผ๏ผšguess vs 1-slip๏ผ‰") print("="*60) print(df_P_true.round(3).to_string()) print("\n" + "="*60) print("ใ€ๅ„้กŒๆˆๅŠŸๆฆ‚็އ P โ€” PyMC DINA ไผฐ่จˆๅ€ผ๏ผˆๅพŒ้ฉ—ๅ‡ๅ€ผ๏ผ‰ใ€‘") print("="*60) print(df_P_est.round(3).to_string()) df_P_diff = df_P_est - df_P_true mab = df_P_diff.abs().values.mean() rmse = np.sqrt((df_P_diff.values**2).mean()) print(f"\n P ็Ÿฉ้™ฃ MAB = {mab:.4f}") print(f" P ็Ÿฉ้™ฃ RMSE = {rmse:.4f}")
============================================================
ใ€ๅ„้กŒๆˆๅŠŸๆฆ‚็އ P โ€” ็œŸๅฏฆๅ€ผใ€‘
  ๏ผˆๆฏ้กŒๅชๆœ‰ 2 ็จฎไธๅŒๅ€ผ๏ผšguess vs 1-slip๏ผ‰
============================================================
      000   001   010   011   100   101   110   111
Q1   0.15  0.15  0.15  0.15  0.90  0.90  0.90  0.90
Q2   0.20  0.20  0.20  0.20  0.85  0.85  0.85  0.85
Q3   0.10  0.10  0.10  0.10  0.90  0.90  0.90  0.90
Q4   0.25  0.25  0.25  0.25  0.85  0.85  0.85  0.85
Q5   0.15  0.15  0.90  0.90  0.15  0.15  0.90  0.90
Q6   0.20  0.20  0.85  0.85  0.20  0.20  0.85  0.85
Q7   0.10  0.10  0.90  0.90  0.10  0.10  0.90  0.90
Q8   0.25  0.25  0.85  0.85  0.25  0.25  0.85  0.85
Q9   0.15  0.90  0.15  0.90  0.15  0.90  0.15  0.90
Q10  0.20  0.85  0.20  0.85  0.20  0.85  0.20  0.85
Q11  0.10  0.90  0.10  0.90  0.10  0.90  0.10  0.90
Q12  0.25  0.85  0.25  0.85  0.25  0.85  0.25  0.85
Q13  0.10  0.10  0.10  0.10  0.10  0.10  0.90  0.90
Q14  0.15  0.15  0.15  0.15  0.15  0.15  0.85  0.85
Q15  0.10  0.10  0.10  0.10  0.10  0.10  0.90  0.90
Q16  0.10  0.10  0.10  0.10  0.10  0.90  0.10  0.90
Q17  0.15  0.15  0.15  0.15  0.15  0.85  0.15  0.85
Q18  0.10  0.10  0.10  0.10  0.10  0.90  0.10  0.90
Q19  0.10  0.10  0.10  0.10  0.10  0.10  0.10  0.90
Q20  0.15  0.15  0.15  0.15  0.15  0.15  0.15  0.85

============================================================
ใ€ๅ„้กŒๆˆๅŠŸๆฆ‚็އ P โ€” PyMC DINA ไผฐ่จˆๅ€ผ๏ผˆๅพŒ้ฉ—ๅ‡ๅ€ผ๏ผ‰ใ€‘
============================================================
       000    001    010    011    100    101    110    111
Q1   0.167  0.167  0.167  0.167  0.890  0.890  0.890  0.890
Q2   0.205  0.205  0.205  0.205  0.830  0.830  0.830  0.830
Q3   0.128  0.128  0.128  0.128  0.888  0.888  0.888  0.888
Q4   0.254  0.254  0.254  0.254  0.853  0.853  0.853  0.853
Q5   0.165  0.165  0.900  0.900  0.165  0.165  0.900  0.900
Q6   0.192  0.192  0.848  0.848  0.192  0.192  0.848  0.848
Q7   0.120  0.120  0.899  0.899  0.120  0.120  0.899  0.899
Q8   0.248  0.248  0.856  0.856  0.248  0.248  0.856  0.856
Q9   0.147  0.895  0.147  0.895  0.147  0.895  0.147  0.895
Q10  0.199  0.860  0.199  0.860  0.199  0.860  0.199  0.860
Q11  0.105  0.912  0.105  0.912  0.105  0.912  0.105  0.912
Q12  0.258  0.859  0.258  0.859  0.258  0.859  0.258  0.859
Q13  0.100  0.100  0.100  0.100  0.100  0.100  0.895  0.895
Q14  0.135  0.135  0.135  0.135  0.135  0.135  0.873  0.873
Q15  0.112  0.112  0.112  0.112  0.112  0.112  0.909  0.909
Q16  0.087  0.087  0.087  0.087  0.087  0.882  0.087  0.882
Q17  0.151  0.151  0.151  0.151  0.151  0.859  0.151  0.859
Q18  0.100  0.100  0.100  0.100  0.100  0.920  0.100  0.920
Q19  0.087  0.087  0.087  0.087  0.087  0.087  0.087  0.869
Q20  0.162  0.162  0.162  0.162  0.162  0.162  0.162  0.834

  P ็Ÿฉ้™ฃ MAB  = 0.0092
  P ็Ÿฉ้™ฃ RMSE = 0.0116
Inย [ย ]:
Copied!
# โ”€โ”€ 5c. ฯ€๏ผš็œŸๅฏฆ vs ไผฐ่จˆ๏ผˆๅซ 95% HDI๏ผ‰โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
pi_samples  = trace.posterior["pi"].values.reshape(-1, L)
pi_hdi_low  = np.percentile(pi_samples, 2.5,  axis=0)
pi_hdi_high = np.percentile(pi_samples, 97.5, axis=0)

df_pi = pd.DataFrame({
    "ๅฑฌๆ€ง็ต„ๅˆ":   profile_labels,
    "็œŸๅฏฆ ฯ€":     pi_true.round(4),
    "ไผฐ่จˆ ฯ€":     pi_est.round(4),
    "SD":         pi_samples.std(axis=0).round(4),
    "95% HDI ไธ‹": pi_hdi_low.round(4),
    "95% HDI ไธŠ": pi_hdi_high.round(4),
    "ๅๅทฎ":       (pi_est - pi_true).round(4),
})
print("\n" + "="*60)
print("ใ€ๅฑฌๆ€ง็ต„ๅˆๆฏ”ไพ‹ ฯ€ โ€” ็œŸๅฏฆ vs ไผฐ่จˆใ€‘")
print("="*60)
print(df_pi.to_string(index=False))
# โ”€โ”€ 5c. ฯ€๏ผš็œŸๅฏฆ vs ไผฐ่จˆ๏ผˆๅซ 95% HDI๏ผ‰โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ pi_samples = trace.posterior["pi"].values.reshape(-1, L) pi_hdi_low = np.percentile(pi_samples, 2.5, axis=0) pi_hdi_high = np.percentile(pi_samples, 97.5, axis=0) df_pi = pd.DataFrame({ "ๅฑฌๆ€ง็ต„ๅˆ": profile_labels, "็œŸๅฏฆ ฯ€": pi_true.round(4), "ไผฐ่จˆ ฯ€": pi_est.round(4), "SD": pi_samples.std(axis=0).round(4), "95% HDI ไธ‹": pi_hdi_low.round(4), "95% HDI ไธŠ": pi_hdi_high.round(4), "ๅๅทฎ": (pi_est - pi_true).round(4), }) print("\n" + "="*60) print("ใ€ๅฑฌๆ€ง็ต„ๅˆๆฏ”ไพ‹ ฯ€ โ€” ็œŸๅฏฆ vs ไผฐ่จˆใ€‘") print("="*60) print(df_pi.to_string(index=False))
============================================================
ใ€ๅฑฌๆ€ง็ต„ๅˆๆฏ”ไพ‹ ฯ€ โ€” ็œŸๅฏฆ vs ไผฐ่จˆใ€‘
============================================================
ๅฑฌๆ€ง็ต„ๅˆ  ็œŸๅฏฆ ฯ€   ไผฐ่จˆ ฯ€     SD  95% HDI ไธ‹  95% HDI ไธŠ      ๅๅทฎ
 000  0.05 0.0606 0.0070     0.0475     0.0746  0.0106
 001  0.08 0.0837 0.0080     0.0692     0.1005  0.0037
 010  0.08 0.0880 0.0083     0.0726     0.1049  0.0080
 011  0.12 0.1000 0.0085     0.0839     0.1169 -0.0200
 100  0.10 0.0993 0.0081     0.0834     0.1152 -0.0007
 101  0.12 0.1184 0.0083     0.1031     0.1349 -0.0016
 110  0.15 0.1420 0.0092     0.1247     0.1608 -0.0080
 111  0.30 0.3081 0.0117     0.2849     0.3314  0.0081
Inย [ย ]:
Copied!
# โ”€โ”€ 5d. slip/guess ๅพŒ้ฉ—ๅˆ†ไฝˆ๏ผˆ95% HDI๏ผ‰โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
slip_samples  = trace.posterior["slip"].values.reshape(-1, J)
guess_samples = trace.posterior["guess"].values.reshape(-1, J)

slip_hdi_low   = np.percentile(slip_samples,  2.5,  axis=0)
slip_hdi_high  = np.percentile(slip_samples,  97.5, axis=0)
guess_hdi_low  = np.percentile(guess_samples, 2.5,  axis=0)
guess_hdi_high = np.percentile(guess_samples, 97.5, axis=0)

df_hdi = pd.DataFrame({
    "้กŒ็›ฎ":          [f"Q{j+1}" for j in range(J)],
    "slip ็œŸๅฏฆ":     slip_true.round(3),
    "slip ไผฐ่จˆ":     slip_est.round(3),
    "slip 95%ไธ‹":   slip_hdi_low.round(3),
    "slip 95%ไธŠ":   slip_hdi_high.round(3),
    "guess ็œŸๅฏฆ":    guess_true.round(3),
    "guess ไผฐ่จˆ":    guess_est.round(3),
    "guess 95%ไธ‹":  guess_hdi_low.round(3),
    "guess 95%ไธŠ":  guess_hdi_high.round(3),
})
print("\n" + "="*80)
print("ใ€slip & guess โ€” ๅพŒ้ฉ—ๅ‡ๅ€ผ่ˆ‡ 95% HDIใ€‘")
print("="*80)
print(df_hdi.to_string(index=False))
# โ”€โ”€ 5d. slip/guess ๅพŒ้ฉ—ๅˆ†ไฝˆ๏ผˆ95% HDI๏ผ‰โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ slip_samples = trace.posterior["slip"].values.reshape(-1, J) guess_samples = trace.posterior["guess"].values.reshape(-1, J) slip_hdi_low = np.percentile(slip_samples, 2.5, axis=0) slip_hdi_high = np.percentile(slip_samples, 97.5, axis=0) guess_hdi_low = np.percentile(guess_samples, 2.5, axis=0) guess_hdi_high = np.percentile(guess_samples, 97.5, axis=0) df_hdi = pd.DataFrame({ "้กŒ็›ฎ": [f"Q{j+1}" for j in range(J)], "slip ็œŸๅฏฆ": slip_true.round(3), "slip ไผฐ่จˆ": slip_est.round(3), "slip 95%ไธ‹": slip_hdi_low.round(3), "slip 95%ไธŠ": slip_hdi_high.round(3), "guess ็œŸๅฏฆ": guess_true.round(3), "guess ไผฐ่จˆ": guess_est.round(3), "guess 95%ไธ‹": guess_hdi_low.round(3), "guess 95%ไธŠ": guess_hdi_high.round(3), }) print("\n" + "="*80) print("ใ€slip & guess โ€” ๅพŒ้ฉ—ๅ‡ๅ€ผ่ˆ‡ 95% HDIใ€‘") print("="*80) print(df_hdi.to_string(index=False))
================================================================================
ใ€slip & guess โ€” ๅพŒ้ฉ—ๅ‡ๅ€ผ่ˆ‡ 95% HDIใ€‘
================================================================================
 ้กŒ็›ฎ  slip ็œŸๅฏฆ  slip ไผฐ่จˆ  slip 95%ไธ‹  slip 95%ไธŠ  guess ็œŸๅฏฆ  guess ไผฐ่จˆ  guess 95%ไธ‹  guess 95%ไธŠ
 Q1     0.10    0.110      0.090      0.131      0.15     0.167       0.134       0.201
 Q2     0.15    0.170      0.147      0.196      0.20     0.205       0.169       0.243
 Q3     0.10    0.112      0.092      0.132      0.10     0.128       0.098       0.158
 Q4     0.15    0.147      0.124      0.170      0.25     0.254       0.214       0.295
 Q5     0.10    0.100      0.081      0.120      0.15     0.165       0.132       0.204
 Q6     0.15    0.152      0.129      0.177      0.20     0.192       0.158       0.230
 Q7     0.10    0.101      0.082      0.121      0.10     0.120       0.091       0.150
 Q8     0.15    0.144      0.121      0.166      0.25     0.248       0.210       0.287
 Q9     0.10    0.105      0.086      0.127      0.15     0.147       0.117       0.180
Q10     0.15    0.140      0.117      0.165      0.20     0.199       0.167       0.234
Q11     0.10    0.088      0.070      0.108      0.10     0.105       0.080       0.132
Q12     0.15    0.141      0.120      0.163      0.25     0.258       0.223       0.295
Q13     0.10    0.105      0.082      0.130      0.10     0.100       0.081       0.122
Q14     0.15    0.127      0.102      0.155      0.15     0.135       0.112       0.160
Q15     0.10    0.091      0.069      0.115      0.10     0.112       0.091       0.135
Q16     0.10    0.118      0.094      0.145      0.10     0.087       0.069       0.107
Q17     0.15    0.141      0.115      0.170      0.15     0.151       0.128       0.176
Q18     0.10    0.080      0.061      0.102      0.10     0.100       0.081       0.122
Q19     0.10    0.131      0.100      0.163      0.10     0.087       0.070       0.105
Q20     0.15    0.166      0.133      0.202      0.15     0.162       0.142       0.185
Inย [ย ]:
Copied!
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
# PART 6๏ผšๅ„ไบบๅˆ†้กž็ตๆžœ๏ผˆMAP ่ˆ‡ EAP๏ผ‰
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•

# โ”€โ”€ 6a. ่จˆ็ฎ—ๅพŒ้ฉ— ฮณ๏ผˆๅ„ไบบๅฐๆฏๅ€‹ๅฑฌๆ€ง็ต„ๅˆ็š„ๅพŒ้ฉ—ๆฆ‚็އ๏ผ‰โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
# ไฝฟ็”จไผฐ่จˆ็š„ slip_estใ€guess_estใ€pi_est ๅšๅ‰ๅ‘่จˆ็ฎ—
log_gamma = np.zeros((N, L))
for l in range(L):
    # P(Y_i | ฮฑ_l)
    log_p_given_l = np.zeros(N)
    for j in range(J):
        if eta[l, j] == 1:
            p_jl = 1 - slip_est[j]
        else:
            p_jl = guess_est[j]
        log_p_given_l += (
            Y[:, j] * np.log(p_jl + 1e-10) +
            (1 - Y[:, j]) * np.log(1 - p_jl + 1e-10)
        )
    log_gamma[:, l] = np.log(pi_est[l] + 1e-10) + log_p_given_l

log_norm = logsumexp(log_gamma, axis=1, keepdims=True)
gamma    = np.exp(log_gamma - log_norm)  # (N, L) ๅพŒ้ฉ—ๆฆ‚็އ

# โ”€โ”€ 6b. MAP ๅˆ†้กž โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
map_idx      = np.argmax(gamma, axis=1)
map_profiles = all_alpha[map_idx]

# โ”€โ”€ 6c. EAP๏ผˆๅ„ๅฑฌๆ€ง็š„ๆœŸๆœ›ๆŽŒๆกๆฆ‚็އ๏ผ‰โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
eap = gamma @ all_alpha  # (N, K)

# โ”€โ”€ 6d. ็ต„ๅˆๅˆ†้กž็ตๆžœ DataFrame โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
df_class = pd.DataFrame({
    "็œŸๅฏฆๅฑฌๆ€ง็ต„ๅˆ":  ["".join(map(str, all_alpha[i])) for i in alpha_idx],
    "MAP้ ๆธฌ็ต„ๅˆ":   ["".join(map(str, p)) for p in map_profiles],
    "ๅˆ†้กžๆญฃ็ขบ":      alpha_idx == map_idx,
    "MAPๅพŒ้ฉ—ๆฆ‚็އ":   gamma[np.arange(N), map_idx].round(3),
    "EAP_ฮฑ1":       eap[:, 0].round(3),
    "EAP_ฮฑ2":       eap[:, 1].round(3),
    "EAP_ฮฑ3":       eap[:, 2].round(3),
})
for l, label in enumerate(profile_labels):
    df_class[f"P({label})"] = gamma[:, l].round(3)

print("\n" + "="*60)
print("ใ€ๅ„ไบบๅˆ†้กž็ตๆžœ๏ผˆๅ‰ 15 ไบบ๏ผ‰ใ€‘")
print("="*60)
print(df_class.head(15).to_string(index=True))

# โ”€โ”€ 6e. ๆ•ด้ซ”ๅˆ†้กžๆบ–็ขบ็އ โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
acc = df_class["ๅˆ†้กžๆญฃ็ขบ"].mean()
print(f"\n  MAP ๆ•ด้ซ”ๅˆ†้กžๆบ–็ขบ็އ = {acc:.4f} ({acc*100:.1f}%)")

# โ”€โ”€ 6f. ๅ„ๅฑฌๆ€ง็ต„ๅˆ็š„ๅˆ†้กžๆบ–็ขบ็އ โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
print("\n  ๅ„ๅฑฌๆ€ง็ต„ๅˆๅˆ†้กžๆบ–็ขบ็އ๏ผš")
summary = df_class.groupby("็œŸๅฏฆๅฑฌๆ€ง็ต„ๅˆ").agg(
    ไบบๆ•ธ=("ๅˆ†้กžๆญฃ็ขบ", "count"),
    ๆญฃ็ขบๆ•ธ=("ๅˆ†้กžๆญฃ็ขบ", "sum"),
    ๆบ–็ขบ็އ=("ๅˆ†้กžๆญฃ็ขบ", "mean"),
    ๅนณๅ‡MAPๅพŒ้ฉ—=("MAPๅพŒ้ฉ—ๆฆ‚็އ", "mean"),
).round(3)
print(summary.to_string())

# โ”€โ”€ 6g. ๆททๆท†็Ÿฉ้™ฃ โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
conf = pd.crosstab(
    df_class["็œŸๅฏฆๅฑฌๆ€ง็ต„ๅˆ"],
    df_class["MAP้ ๆธฌ็ต„ๅˆ"],
    rownames=["็œŸๅฏฆ"], colnames=["้ ๆธฌ"]
)
print("\n" + "="*60)
print("ใ€ๆททๆท†็Ÿฉ้™ฃ๏ผˆ่กŒ=็œŸๅฏฆ๏ผŒๅˆ—=้ ๆธฌ๏ผ‰ใ€‘")
print("="*60)
print(conf.to_string())
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• # PART 6๏ผšๅ„ไบบๅˆ†้กž็ตๆžœ๏ผˆMAP ่ˆ‡ EAP๏ผ‰ # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• # โ”€โ”€ 6a. ่จˆ็ฎ—ๅพŒ้ฉ— ฮณ๏ผˆๅ„ไบบๅฐๆฏๅ€‹ๅฑฌๆ€ง็ต„ๅˆ็š„ๅพŒ้ฉ—ๆฆ‚็އ๏ผ‰โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ # ไฝฟ็”จไผฐ่จˆ็š„ slip_estใ€guess_estใ€pi_est ๅšๅ‰ๅ‘่จˆ็ฎ— log_gamma = np.zeros((N, L)) for l in range(L): # P(Y_i | ฮฑ_l) log_p_given_l = np.zeros(N) for j in range(J): if eta[l, j] == 1: p_jl = 1 - slip_est[j] else: p_jl = guess_est[j] log_p_given_l += ( Y[:, j] * np.log(p_jl + 1e-10) + (1 - Y[:, j]) * np.log(1 - p_jl + 1e-10) ) log_gamma[:, l] = np.log(pi_est[l] + 1e-10) + log_p_given_l log_norm = logsumexp(log_gamma, axis=1, keepdims=True) gamma = np.exp(log_gamma - log_norm) # (N, L) ๅพŒ้ฉ—ๆฆ‚็އ # โ”€โ”€ 6b. MAP ๅˆ†้กž โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ map_idx = np.argmax(gamma, axis=1) map_profiles = all_alpha[map_idx] # โ”€โ”€ 6c. EAP๏ผˆๅ„ๅฑฌๆ€ง็š„ๆœŸๆœ›ๆŽŒๆกๆฆ‚็އ๏ผ‰โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ eap = gamma @ all_alpha # (N, K) # โ”€โ”€ 6d. ็ต„ๅˆๅˆ†้กž็ตๆžœ DataFrame โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ df_class = pd.DataFrame({ "็œŸๅฏฆๅฑฌๆ€ง็ต„ๅˆ": ["".join(map(str, all_alpha[i])) for i in alpha_idx], "MAP้ ๆธฌ็ต„ๅˆ": ["".join(map(str, p)) for p in map_profiles], "ๅˆ†้กžๆญฃ็ขบ": alpha_idx == map_idx, "MAPๅพŒ้ฉ—ๆฆ‚็އ": gamma[np.arange(N), map_idx].round(3), "EAP_ฮฑ1": eap[:, 0].round(3), "EAP_ฮฑ2": eap[:, 1].round(3), "EAP_ฮฑ3": eap[:, 2].round(3), }) for l, label in enumerate(profile_labels): df_class[f"P({label})"] = gamma[:, l].round(3) print("\n" + "="*60) print("ใ€ๅ„ไบบๅˆ†้กž็ตๆžœ๏ผˆๅ‰ 15 ไบบ๏ผ‰ใ€‘") print("="*60) print(df_class.head(15).to_string(index=True)) # โ”€โ”€ 6e. ๆ•ด้ซ”ๅˆ†้กžๆบ–็ขบ็އ โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ acc = df_class["ๅˆ†้กžๆญฃ็ขบ"].mean() print(f"\n MAP ๆ•ด้ซ”ๅˆ†้กžๆบ–็ขบ็އ = {acc:.4f} ({acc*100:.1f}%)") # โ”€โ”€ 6f. ๅ„ๅฑฌๆ€ง็ต„ๅˆ็š„ๅˆ†้กžๆบ–็ขบ็އ โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ print("\n ๅ„ๅฑฌๆ€ง็ต„ๅˆๅˆ†้กžๆบ–็ขบ็އ๏ผš") summary = df_class.groupby("็œŸๅฏฆๅฑฌๆ€ง็ต„ๅˆ").agg( ไบบๆ•ธ=("ๅˆ†้กžๆญฃ็ขบ", "count"), ๆญฃ็ขบๆ•ธ=("ๅˆ†้กžๆญฃ็ขบ", "sum"), ๆบ–็ขบ็އ=("ๅˆ†้กžๆญฃ็ขบ", "mean"), ๅนณๅ‡MAPๅพŒ้ฉ—=("MAPๅพŒ้ฉ—ๆฆ‚็އ", "mean"), ).round(3) print(summary.to_string()) # โ”€โ”€ 6g. ๆททๆท†็Ÿฉ้™ฃ โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ conf = pd.crosstab( df_class["็œŸๅฏฆๅฑฌๆ€ง็ต„ๅˆ"], df_class["MAP้ ๆธฌ็ต„ๅˆ"], rownames=["็œŸๅฏฆ"], colnames=["้ ๆธฌ"] ) print("\n" + "="*60) print("ใ€ๆททๆท†็Ÿฉ้™ฃ๏ผˆ่กŒ=็œŸๅฏฆ๏ผŒๅˆ—=้ ๆธฌ๏ผ‰ใ€‘") print("="*60) print(conf.to_string())
============================================================
ใ€ๅ„ไบบๅˆ†้กž็ตๆžœ๏ผˆๅ‰ 15 ไบบ๏ผ‰ใ€‘
============================================================
   ็œŸๅฏฆๅฑฌๆ€ง็ต„ๅˆ MAP้ ๆธฌ็ต„ๅˆ  ๅˆ†้กžๆญฃ็ขบ  MAPๅพŒ้ฉ—ๆฆ‚็އ  EAP_ฮฑ1  EAP_ฮฑ2  EAP_ฮฑ3  P(000)  P(001)  P(010)  P(011)  P(100)  P(101)  P(110)  P(111)
0     010     010  True    0.972   0.000   0.980   0.008   0.019   0.000   0.972   0.008   0.000   0.000     0.0     0.0
1     110     110  True    1.000   1.000   1.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000     1.0     0.0
2     101     101  True    0.984   0.991   0.000   0.993   0.000   0.009   0.000   0.000   0.007   0.984     0.0     0.0
3     111     111  True    1.000   1.000   1.000   1.000   0.000   0.000   0.000   0.000   0.000   0.000     0.0     1.0
4     111     111  True    1.000   1.000   1.000   1.000   0.000   0.000   0.000   0.000   0.000   0.000     0.0     1.0
5     011     011  True    0.906   0.000   0.999   0.908   0.000   0.001   0.092   0.906   0.000   0.000     0.0     0.0
6     011     011  True    0.941   0.000   0.942   0.999   0.000   0.058   0.001   0.941   0.000   0.000     0.0     0.0
7     111     111  True    1.000   1.000   1.000   1.000   0.000   0.000   0.000   0.000   0.000   0.000     0.0     1.0
8     111     111  True    1.000   1.000   1.000   1.000   0.000   0.000   0.000   0.000   0.000   0.000     0.0     1.0
9     111     111  True    1.000   1.000   1.000   1.000   0.000   0.000   0.000   0.000   0.000   0.000     0.0     1.0
10    100     100  True    0.951   0.951   0.000   0.002   0.047   0.002   0.000   0.000   0.951   0.000     0.0     0.0
11    101     101  True    1.000   1.000   0.000   1.000   0.000   0.000   0.000   0.000   0.000   1.000     0.0     0.0
12    110     110  True    1.000   1.000   1.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000     1.0     0.0
13    111     111  True    1.000   1.000   1.000   1.000   0.000   0.000   0.000   0.000   0.000   0.000     0.0     1.0
14    100     100  True    0.979   0.979   0.000   0.000   0.020   0.000   0.000   0.000   0.979   0.000     0.0     0.0

  MAP ๆ•ด้ซ”ๅˆ†้กžๆบ–็ขบ็އ = 0.9560 (95.6%)

  ๅ„ๅฑฌๆ€ง็ต„ๅˆๅˆ†้กžๆบ–็ขบ็އ๏ผš
         ไบบๆ•ธ  ๆญฃ็ขบๆ•ธ    ๆบ–็ขบ็އ  ๅนณๅ‡MAPๅพŒ้ฉ—
็œŸๅฏฆๅฑฌๆ€ง็ต„ๅˆ                          
000      88   77  0.875    0.866
001     127  112  0.882    0.883
010     127  119  0.937    0.922
011     153  135  0.882    0.910
100     145  139  0.959    0.938
101     180  176  0.978    0.969
110     218  214  0.982    0.972
111     462  462  1.000    0.996

============================================================
ใ€ๆททๆท†็Ÿฉ้™ฃ๏ผˆ่กŒ=็œŸๅฏฆ๏ผŒๅˆ—=้ ๆธฌ๏ผ‰ใ€‘
============================================================
้ ๆธฌ   000  001  010  011  100  101  110  111
็œŸๅฏฆ                                         
000   77    2    3    0    5    1    0    0
001    7  112    0    6    1    1    0    0
010    4    0  119    2    0    0    2    0
011    0   11    6  135    1    0    0    0
100    2    1    0    0  139    2    1    0
101    0    2    0    0    0  176    0    2
110    0    1    2    0    1    0  214    0
111    0    0    0    0    0    0    0  462
Previous Next

Built with MkDocs using a theme provided by Read the Docs.